Timo Freiberg
18. March 2021
4 min read

The Case for the Typestate Pattern - The Typestate Pattern itself

In the previous article, I showed two equivalent implementations of the same program to compare different approaches to encoding state in types.

Now, I’ll show the typestate pattern.

Quick refresher

The data structure of the previous version looked like this:
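As a rough sketch of an enum-based structure of this kind (the field and variant names are assumptions for illustration, not the original listing), all states live inside a single `State` type:

```rust
// Hypothetical sketch: field and variant names are illustrative assumptions.
#[derive(Debug)]
pub struct RepairOrder {
    pub order_number: u64,
    pub vehicle: String,
    // Every possible state is a variant of the same enum.
    pub state: State,
}

#[derive(Debug)]
pub enum State {
    New,
    Valid,
    Invalid { validation_errors: Vec<String> },
    InProgress { steps_left: Vec<String> },
    Paid { invoice: String },
}
```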


I had one issue with that approach: every function that expected a particular state had to repeat the same annoying pattern match:
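A minimal sketch of that boilerplate, assuming a hypothetical `work` function that only makes sense for orders in progress (all names here are illustrative):

```rust
// Hypothetical sketch: names are illustrative assumptions.
pub enum State {
    New,
    InProgress { steps_left: Vec<String> },
    Done,
}

pub struct RepairOrder {
    pub order_number: u64,
    pub state: State,
}

/// Performs one repair step. Only meaningful for `State::InProgress`.
pub fn work(order: &mut RepairOrder) -> Result<(), String> {
    // The state has to be unpacked at runtime before its data can be used,
    // and the "wrong state" case has to be handled every single time.
    let steps_left = match &mut order.state {
        State::InProgress { steps_left } => steps_left,
        _ => return Err(format!("order {} is not in progress", order.order_number)),
    };
    steps_left.pop();
    if steps_left.is_empty() {
        order.state = State::Done;
    }
    Ok(())
}
```

Nothing in the signature of `work` tells the caller which state is expected; that only becomes apparent inside the function body.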


To improve that, we can give each state its own type. But instead of having separate NewRepairOrder, InvalidRepairOrder, etc. types, we can use the state both as a field and as a type parameter. This changes the corresponding types into RepairOrder<New> and RepairOrder<Invalid>. The data structures look quite similar to the previous version:
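A minimal sketch of such a definition, again with assumed field and state names, where each state is its own struct and `RepairOrder` is generic over it:

```rust
// Hypothetical sketch: field and state names are illustrative assumptions.
pub struct RepairOrder<State> {
    pub order_number: u64,
    pub vehicle: String,
    // The state is a regular field whose type is the type parameter.
    pub state: State,
}

// Each state is a separate type; states without data are unit structs.
pub struct New;
pub struct Valid;
pub struct Invalid {
    pub validation_errors: Vec<String>,
}
pub struct InProgress {
    pub steps_left: Vec<String>,
}
pub struct Paid {
    pub invoice: String,
}
```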


Very similar, except that the states are no longer part of a single type: each one is its own separate type. Let’s look at some of the state-transitioning functions:
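State transitions in this style could be sketched as follows. The method names (`validate`, `start_work`, `steps_left`) and the `with_state` helper are assumptions for illustration, not necessarily the names used in the original code:

```rust
// Hypothetical sketch: all names are illustrative assumptions.
pub struct RepairOrder<State> {
    pub order_number: u64,
    pub damage_description: Option<String>,
    pub state: State,
}

pub struct New;
pub struct Valid;
pub struct Invalid {
    pub validation_errors: Vec<String>,
}
pub struct InProgress {
    pub steps_left: Vec<String>,
}

impl<S> RepairOrder<S> {
    // Rebuilds the order with a new state type.
    fn with_state<T>(self, state: T) -> RepairOrder<T> {
        RepairOrder {
            order_number: self.order_number,
            damage_description: self.damage_description,
            state,
        }
    }
}

impl RepairOrder<New> {
    // The signature spells out both possible target states.
    pub fn validate(self) -> Result<RepairOrder<Valid>, RepairOrder<Invalid>> {
        if self.damage_description.is_some() {
            Ok(self.with_state(Valid))
        } else {
            let validation_errors = vec!["missing damage description".to_string()];
            Err(self.with_state(Invalid { validation_errors }))
        }
    }
}

impl RepairOrder<Valid> {
    pub fn start_work(self, steps: Vec<String>) -> RepairOrder<InProgress> {
        self.with_state(InProgress { steps_left: steps })
    }
}

impl RepairOrder<InProgress> {
    // State data is accessed directly; no `match` is needed.
    pub fn steps_left(&self) -> usize {
        self.state.steps_left.len()
    }
}
```

Calling `start_work` on a `RepairOrder<New>` is a compile error in this sketch, because the method only exists for `RepairOrder<Valid>`.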


So the state is now visible in the types, allowing us to directly access its data without any further checks. The type signatures also make all state transitions very visible, which could make complicated state machines more readable.

You might have noticed that I don’t set the state via self.state = new_state, instead using a helper function, which looks like this:
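A minimal sketch of such a helper, assuming it is called `with_state` and that the struct has the fields shown (both are assumptions for illustration):

```rust
// Hypothetical sketch: struct fields and the helper name are assumptions.
pub struct RepairOrder<State> {
    pub order_number: u64,
    pub vehicle: String,
    pub state: State,
}

impl<S> RepairOrder<S> {
    /// Consumes the order and rebuilds it with a state of a different type.
    pub fn with_state<T>(self, new_state: T) -> RepairOrder<T> {
        // `self.state = new_state` would not compile: `self.state` has type
        // `S`, and an assignment cannot turn `RepairOrder<S>` into
        // `RepairOrder<T>`. Struct update syntax (`..self`) is also rejected
        // on stable Rust when the type parameter changes, so every other
        // field is moved over by hand.
        RepairOrder {
            order_number: self.order_number,
            vehicle: self.vehicle,
            state: new_state,
        }
    }
}
```

The cost is that the helper has to list every field of the struct, so it needs updating whenever a field is added.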


This is unfortunately necessary to change the state type parameter [1].

The main function still only consists of state transitions, but now each function’s return value has to be passed on to the next step.


I criticised the verbosity of the previous version, but surely having to give intermediate steps a name is fine? Well, we actually don’t have to, because returning the next state gave us a fluent API. Using method chaining shortens the function drastically:
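The difference between naming every intermediate value and chaining the calls can be sketched like this. All names (`process_step_by_step`, `process_chained`, the states and their methods) are assumptions for illustration, and the transitions are deliberately trivial:

```rust
// Hypothetical sketch: all names are illustrative assumptions.
pub struct RepairOrder<State> {
    pub order_number: u64,
    pub damage_description: Option<String>,
    pub state: State,
}

pub struct New;
pub struct Valid;
pub struct Invalid {
    pub validation_errors: Vec<String>,
}
pub struct InProgress {
    pub steps_left: Vec<String>,
}
pub struct WorkDone;
pub struct Paid {
    pub invoice: String,
}

impl<S> RepairOrder<S> {
    fn with_state<T>(self, state: T) -> RepairOrder<T> {
        RepairOrder {
            order_number: self.order_number,
            damage_description: self.damage_description,
            state,
        }
    }
}

impl RepairOrder<New> {
    pub fn validate(self) -> Result<RepairOrder<Valid>, RepairOrder<Invalid>> {
        if self.damage_description.is_some() {
            Ok(self.with_state(Valid))
        } else {
            let validation_errors = vec!["missing damage description".to_string()];
            Err(self.with_state(Invalid { validation_errors }))
        }
    }
}

impl RepairOrder<Valid> {
    pub fn start_work(self) -> RepairOrder<InProgress> {
        let steps_left = vec!["repair".to_string()];
        self.with_state(InProgress { steps_left })
    }
}

impl RepairOrder<InProgress> {
    pub fn work(mut self) -> RepairOrder<WorkDone> {
        // Pretend that all remaining steps are carried out here.
        self.state.steps_left.clear();
        self.with_state(WorkDone)
    }
}

impl RepairOrder<WorkDone> {
    pub fn pay(self) -> RepairOrder<Paid> {
        let invoice = format!("invoice for order {}", self.order_number);
        self.with_state(Paid { invoice })
    }
}

// Every intermediate state gets its own name.
pub fn process_step_by_step(
    order: RepairOrder<New>,
) -> Result<RepairOrder<Paid>, RepairOrder<Invalid>> {
    let valid = match order.validate() {
        Ok(valid) => valid,
        Err(invalid) => return Err(invalid),
    };
    let in_progress = valid.start_work();
    let work_done = in_progress.work();
    let paid = work_done.pay();
    Ok(paid)
}

// The same transitions as one chain; `?` returns early on an invalid order.
pub fn process_chained(
    order: RepairOrder<New>,
) -> Result<RepairOrder<Paid>, RepairOrder<Invalid>> {
    Ok(order.validate()?.start_work().work().pay())
}
```

Both functions perform the same transitions. The chained one only works because each method consumes `self` and returns the order in its next state.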


I had to change the function signature slightly to use the ? operator, but I think this only makes it more realistic.

The method chaining might not be everyone’s cup of tea, but I actually think this is the prettiest version so far.

Pros

  • All pros of the previous version
  • More specific type signatures
  • Allows method chaining
    • OK, this is a bit unfair: the previous version could also have offered a fluent API.
      It’s very natural in this version, though.
  • No boilerplate state unpacking with match state {...}

Con

  • One boilerplate state transitioning function is required

Sounds awesome, should I use this pattern everywhere now?

Well, it depends. I would argue that for this specific algorithm I made up, the typestate pattern would be a good idea.

In other situations, though, it might be a very bad idea. In the next article, I’m going to give some examples where other approaches work much better.

Only in Rust? (Or Haskell?)

I have mostly heard of the typestate pattern in the Rust and Haskell communities. But the examples in this post are easily translatable into Kotlin, which I’m going to show in a future article.

Further reading

I can recommend these articles:

http://cliffle.com/blog/rust-typestate/ – a more in-depth look at the way Rust’s type system helps represent state.

https://chrisdone.com/posts/path-package/ – the motivation behind an unusually type-safe path handling library, written in Haskell (which blew my mind at the time).


The full example code is available here.

[1] The same verbose workaround is required in Kotlin. Haskell does not have the same issue, but then there’s no in-place mutation in Haskell. There’s a Rust RFC in progress to improve this exact interaction.


Comments

  1. Arthur

    Nice post, Timo!

    The only problem I have with this pattern is persistence. Let me elaborate.

    Your example is synchronous. But imagine that every step of it is async and might take days to finish, so every step of the process has to be persistable. That in itself is not a problem. The problem is how to get the object back: since you don’t know which type of object was saved to the db, you can’t know what kind of object will be restored, which means there is no way to write a single method to load it from storage. In short: while working with the object is pretty nice, safe and straightforward, it’s almost impossible to work with it when it has to be saved to a db.

    Am I missing something here?

    • Freiberg Timo

      Hi Arthur, glad you liked it!

      That’s in fact the first example I could think of where the pattern has obvious drawbacks.
      I wrote a longer response to a similar question in the reddit comments here: https://www.reddit.com/r/rust/comments/m7nox4/the_case_for_the_typestate_pattern_the_typestate/grf0sle/?utm_source=reddit&utm_medium=web2x&context=3
      So no, I don’t think you’re missing something here.
      I think there are ways to have these kinds of serialization requirements and still use the typestate pattern, but it becomes a lot less attractive.
      But if the state transitions are extremely subtle and making the input and output states visible in the type signatures helps a lot with correctness and readability, it might still be worth it.