Manually configuring server landscapes poses several challenges for IT professionals, not least because of the ever-increasing complexity of IT infrastructure and the demands placed on it.
The Infrastructure as Code (IaC) approach offers an alternative in which servers are provisioned, configured and managed through machine-readable definitions such as scripts.
IaC can be defined as the formal description of infrastructure such as servers, databases and network components. The approach can be implemented using a variety of software tools and frameworks.
NOTE: Because the question “What is IT infrastructure and what does it comprise?” could fill several books on its own, we have decided to focus on servers as a stand-in for infrastructure in this article. It does not matter whether the server is bare-metal, containerised or virtualised.
Where does Infrastructure as Code come from?
“Only those who know the past can understand the present and shape the future” – August Bebel
Let’s travel back in time 15 to 20 years and take a look at software companies like Adobe, Atlassian or Facebook. From today’s perspective, these are very successful companies, but they too started small. To sell their digital products, they first operated individual servers and got by with little hardware. Gradually, they built smaller data centres. At first they ran these themselves; later they bought administration and hardware from service providers. Over the years, the server landscape in the data centres grew steadily so that stable operation could be guaranteed for a growing number of users. Many people completely underestimate how many servers this requires. For example, in 2009, just 5 years after the launch of its social network, Facebook announced that it was running more than 30,000 servers!
Provisioning, configuration & deployment used to be quite slow
Even if a company only needs to deploy a few dozen servers, organisational challenges quickly arise. Hardware has to be ordered and installed in the server rooms when scaling up. Establishing a reliable network connection is just as important. Thanks to cloud providers, such as AWS, Azure or Google, the effort is much smaller today. Servers are accessible via the internet within minutes of being ordered. They can also be connected to virtual networks, load balancers, etc. by configuration.
But after providing the bare server, you are not done yet. Usually, each server has to be set up for a specific purpose: The configuration includes, for example, user administration, the installation of dependencies, runtime environments and other tasks to operate software applications on top.
Once a server is ready, the actual software can be installed. Today, such a deployment usually happens automatically. The DevOps movement, at the latest, has made it clear that such process steps should ideally be fully automated.
Deployments of both software and servers are not one-off or infrequent activities. In a larger landscape, old servers are constantly being removed, new ones added and existing ones adapted. This creates a natural life cycle of servers.
Automation to prevent errors and configuration drift
The steps described above are carried out for each new server. Meanwhile, the operation of the “old” servers and the further development of the products must of course continue. Because setting up a server takes considerable effort, it can be tempting without IaC to modify or reuse an old server instead of creating a new one. It can easily happen that a “test machine” is turned into a new “production instance”. Sometimes runtime environments have to be changed urgently or additional tooling has to be installed.
Such undocumented ad-hoc changes tend to cause problems in the long run. The result is called “configuration drift”: the server gradually deviates from its standard configuration. Because servers run for a long time, they accumulate more and more legacy baggage. Step by step, they mutate into so-called snowflake servers, each one unique and therefore almost impossible to maintain.
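To make the idea of drift concrete, here is a minimal Python sketch that compares a documented standard configuration with what is actually found on a server. The function name `find_drift` and all settings and values are made up for illustration; a real check would have to query the server itself.

```python
from typing import Any


def find_drift(desired: dict[str, Any], actual: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return every setting whose actual value deviates from the desired one.

    Each entry maps the setting name to a (desired, actual) pair.
    A setting that is missing on one side shows up as None on that side.
    """
    drift = {}
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key)
        have = actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical standard configuration and the state found on one server
desired_state = {"apache2": "2.4", "python": "3.11", "admin_user": "ops"}
actual_state = {"apache2": "2.4", "python": "3.9", "admin_user": "ops", "debug_tool": "installed"}

drift_report = find_drift(desired_state, actual_state)
for setting, (want, have) in drift_report.items():
    print(f"{setting}: expected {want!r}, found {have!r}")
```

In this example the report contains two entries: an undocumented extra tool and a runtime version that was changed by hand.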
To counter snowflake servers, the Phoenix server pattern became established. “Like the Phoenix from the ashes”, a server is virtually burnt down and recreated at regular intervals.
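As a rough illustration of “regular intervals”, the following Python sketch picks out the servers that have been running longer than a chosen maximum age. The `Server` class, the `due_for_rebuild` function, the server names and the 30-day limit are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Server:
    name: str
    created_at: datetime


def due_for_rebuild(servers: list[Server], max_age: timedelta, now: datetime) -> list[str]:
    """Return the names of all servers that have been running longer than max_age."""
    return [s.name for s in servers if now - s.created_at > max_age]


# Hypothetical fleet; "now" is fixed so the example is reproducible
now = datetime(2024, 6, 1)
fleet = [
    Server("web-1", datetime(2024, 5, 25)),  # 7 days old
    Server("web-2", datetime(2024, 3, 1)),   # 92 days old
]
stale = due_for_rebuild(fleet, max_age=timedelta(days=30), now=now)
print(stale)
```

The sketch only decides which servers are due; the rebuild itself still has to be carried out for each of them.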
However, regularly rebuilding servers would be extremely time-consuming without automating the infrastructure. In the past, attempts were made to realise this automation using shell scripts. However, it quickly became clear that such scripts are not the right tool due to their high complexity and poor portability. In 1993, the first IaC tool was developed with the “CFEngine” project. CFEngine offers an interface that is independent of the operating system and thus abstracts the differences between the various Linux distributions. With its declarative and domain-specific description language, the tool simplifies the configuration of servers immensely. Today, many similar solutions exist, as the following timeline shows:
All these tools attempt to simplify and accelerate infrastructure automation while increasing code quality and readability. Many tackle this difficult endeavour with the help of principles and practices from software development. The tools specialise in different phases, which we explain in more detail below.
The different phases
When applying IaC, a distinction is made between two phases. Each phase covers specific work:
- Initial Setup Phase
  - Provisioning of the infrastructure
  - Configuration of the infrastructure
  - Initial installation of software
  - Initial configuration of software
- Maintaining Phase
  - Adjusting the infrastructure
  - Removal and addition of components
  - Software Updates
  - Reconfiguration of software
To abstract a bit, we will talk about the initial infrastructure setup and the initial application setup and management. This covers both phases completely. Depending on the approach and tool, the focus is more on the infrastructure side or the application side. Docker images, for example, are used as deployables rather than as infrastructure components, which is why they are located on the right-hand side in the following image. The graphic can look different depending on the perspective and is not meant to be absolute.
The different types of IaC
There are many different types of IaC. Each has its own advantages and disadvantages. So depending on the application, it is important to discover which tools and methods can generate the greatest added value. In the following, we present the most common types and demonstrate them with a small code snippet.
Scripts
The most basic way to automate something is to write a script. The steps of the otherwise manually performed task are written down in the preferred scripting language and then executed in the target environment. The following Bash script installs a web server and starts it.
# Update Package Manager
sudo apt-get update
# Install Apache
sudo apt-get install -y apache2
# Start Apache
sudo service apache2 start
Popular scripting languages:
Configuration Management Tools
Configuration management tools are designed to install and manage software on existing servers. For example, here is an Ansible playbook that sets up the same Apache web server as the Bash script above:
- hosts: apache
  become: true
  tasks:
    - name: install apache2
      apt: name=apache2 update_cache=yes state=latest
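A key idea behind such tools is that a task describes a desired state rather than a command, so running it a second time should change nothing. The following Python sketch illustrates that behaviour with a hypothetical `ensure_installed` helper and an in-memory set standing in for a server's package list. It is only an illustration of the principle, not how Ansible is implemented.

```python
def ensure_installed(package: str, installed: set[str]) -> bool:
    """Make sure `package` is present; return True only if a change was made."""
    if package in installed:
        return False  # already in the desired state, nothing to do
    installed.add(package)  # stand-in for the real installation step
    return True


# Hypothetical package list of an existing server
server_packages = {"openssh-server"}

first_run = ensure_installed("apache2", server_packages)
second_run = ensure_installed("apache2", server_packages)
print(first_run, second_run)
```

The first call reports a change, the second one does not, which is what makes it safe to apply the same description to a server again and again.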
Popular configuration management tools:
Templating Tools
Templating tools such as Docker, Packer and Vagrant are an alternative to configuration management tools. Rather than launching a series of instances and configuring them by running the same code on each one, the idea behind templating tools is to create an image. This “snapshot” of the operating system, the software and any files can then be delivered as a standalone artefact. Here is an example of a Dockerfile as a template for an Ubuntu-based web server image (the base image tag shown is just an example):
FROM ubuntu:22.04
RUN apt-get -y update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y apache2
# Run Apache in the foreground so the container keeps running
ENTRYPOINT ["/usr/sbin/apache2ctl"]
CMD ["-D", "FOREGROUND"]
Orchestration Tools
Templating tools are great for creating VMs and containers, but how can you manage many of them efficiently? This is where tools like Kubernetes, Amazon ECS, Docker Swarm or Nomad come into play. Since these tools are very complex and provide an enormous range of functionality, we will not go into more detail about how they work here. The behaviour of a Kubernetes cluster can be defined by code. This includes, for example, how your Docker containers should be executed, how many instances should be kept running and how to proceed with a rollout.
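The idea of “how many instances should be kept running” can be sketched in a few lines of Python. The `reconcile` function and the instance names below are hypothetical, and the logic is heavily simplified; it is not the actual algorithm of Kubernetes or any other orchestrator, only an illustration of comparing a desired count with the current state.

```python
def reconcile(desired_replicas: int, running: list[str]) -> list[tuple[str, str]]:
    """Return the actions needed to move from the running instances to the desired count."""
    difference = desired_replicas - len(running)
    if difference > 0:
        # Too few instances: start as many new ones as are missing
        return [("start", f"new-instance-{i}") for i in range(1, difference + 1)]
    if difference < 0:
        # Too many instances: stop the surplus ones at the end of the list
        return [("stop", name) for name in running[desired_replicas:]]
    return []  # desired state already reached


scale_up = reconcile(3, ["web-1"])
scale_down = reconcile(1, ["web-1", "web-2"])
print(scale_up)
print(scale_down)
```

An orchestrator repeats a comparison of this kind continuously, so a crashed instance is replaced without anyone having to intervene.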
Provisioning Tools
Provisioning tools like Terraform, AWS CloudFormation and Pulumi are mainly meant for describing and creating cloud infrastructure. In fact, you can use them to create not only servers, but also caches, load balancers, firewall settings, routing rules and pretty much any other aspect of an IT infrastructure. Often configuration management or templating tools and provisioning tools intertwine. For example, Terraform can create a VM that is then set up with Puppet.
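When several resources depend on each other, they have to be created in a sensible order, for instance a VM before the step that configures it. The following Python sketch shows that ordering idea using the standard library's `graphlib` module (Python 3.9 or newer). The resource names and their dependencies are invented for this example, and real provisioning tools work out such relationships from the infrastructure description themselves.

```python
from graphlib import TopologicalSorter

# Hypothetical resources, each mapped to the resources it depends on
dependencies = {
    "network": set(),
    "firewall_rules": {"network"},
    "vm": {"network"},
    "load_balancer": {"vm"},
    "configure_vm": {"vm"},  # e.g. handing the VM over to a configuration management tool
}

# static_order() yields every resource after all of its dependencies
creation_order = list(TopologicalSorter(dependencies).static_order())
print(creation_order)
```

The exact position of independent resources may vary, but the network always comes before the VM, and the VM before its configuration and the load balancer.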
Choosing the right IaC toolchain is not easy, as there is no one-size-fits-all solution. The advantages of established configuration management, provisioning and orchestration tools are apparent. While even small projects benefit greatly from the IaC approach, powerful orchestration tools are not always worth the effort for them. If you have a clearly defined goal and want to introduce one of the above-mentioned types of IaC in your project, you can work through your requirements step by step to find the best setup.
With this overview, it should be easier for you to make the right choice for your project. If you still have questions, please ask them in the comments. You are also welcome to present your current setup there; we are curious to know which setup you use and why.