Your challenges with classic maintenance strategies

Do you wonder how to minimize downtime for your machines and equipment and maximize availability? Which of the maintenance strategies makes the most sense? And how can you optimize your plant efficiency with preventive maintenance?

There are many reasons to digitize your processes: Maintenance is a critical factor for the success of a company, as maintenance costs account for up to 40% of the total costs [Eick, Reichel, Schmidt: Instandhaltung des Kapitalstocks in Deutschland: Rolle und volkswirtschaftliche Bedeutung; 2011]. Many companies are still pursuing a reactive or preventive strategy. Machine learning (ML) processes allow for significantly better maintenance strategies:

  • In reactive maintenance, measures are taken only after the malfunction occurs, which can cause long downtimes and thus production losses. Because machines are operated up to the limits of their service life, the risk of catastrophic damage increases enormously.
  • Preventive maintenance aims to replace components before they malfunction. Components are replaced regardless of their condition, mostly at fixed time intervals, so increased or reduced stress on the machine is not taken into account. However, variations in stress do affect wear, so depending on utilization a maintenance measure can become necessary significantly sooner or later than scheduled. This makes it difficult to determine the right time for a maintenance measure, and the result is either unplanned production losses or premature measures. The availability of spare parts and logistical aspects add to the complexity of planning: a maintenance measure must be coordinated with production planning, technicians, and in some cases the customer service of the machine manufacturer.
  • In condition-based maintenance, the actual state of wear of the machine is used instead of a time interval. It is registered by sensors and then maintenance measures are determined from simple rules such as the exceeding or undershooting of threshold values. This reduces spontaneous equipment malfunctions.
  • Predictive maintenance builds on the condition-based approach and uses machine learning methods to learn patterns in historical sensor data; these patterns are then used to predict the remaining service life of machines. A prerequisite for this is an IIoT platform that stores all the data centrally in one location and allows for comprehensive analyses. This way, the service life of machines and components can be used as fully as possible without causing defects in product quality or malfunctions.
  • In the final stage of maturity, recommendations for planning maintenance are made based on predictive maintenance. Here, the objective is to keep downtimes and costs as low as possible and to prevent subsequent malfunctions. In addition, spare parts and maintenance staff must be available. On this basis, maintenance jobs can be bundled and performed to achieve the best possible use of resources.

Maturity Degree Model for Maintenance Strategies.
Source: Novatec internal
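
The simple threshold rules used in condition-based maintenance can be illustrated with a minimal sketch. The sensor names and limit values below are hypothetical and chosen only for illustration; real limits would come from the machine manufacturer or from operating experience.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ThresholdRule:
    """A condition-based rule: the reading must stay within [lower, upper]."""
    sensor: str
    lower: Optional[float] = None  # None means "no lower limit"
    upper: Optional[float] = None  # None means "no upper limit"

    def violated_by(self, value: float) -> bool:
        below = self.lower is not None and value < self.lower
        above = self.upper is not None and value > self.upper
        return below or above


def due_for_maintenance(readings: dict, rules: list) -> list:
    """Return the names of all sensors whose current reading violates a rule."""
    return [
        rule.sensor
        for rule in rules
        if rule.sensor in readings and rule.violated_by(readings[rule.sensor])
    ]


# Hypothetical limits, for illustration only.
rules = [
    ThresholdRule("spindle_temperature_c", upper=80.0),
    ThresholdRule("coolant_level_pct", lower=20.0),
]
current = {"spindle_temperature_c": 85.2, "coolant_level_pct": 45.0}
print(due_for_maintenance(current, rules))
```

Rules of this kind react only once a limit is crossed; they do not estimate how much service life remains, which is the gap that predictive maintenance addresses.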

The Advantages of Using Predictive Maintenance

The purpose of maintenance is to ensure the technical availability of production equipment and machines and to reduce unplanned, technically related shutdowns. In preventive maintenance, the time intervals for maintenance measures are set a bit shorter than the average operating time between two malfunctions. This results in maintenance measures that often are not yet necessary at the time. While maintenance is being performed, equipment and machines cannot be used. Because maintenance measures as well as production losses are associated with costs, it makes greater economic sense to carry out as many maintenance measures as necessary, but at the same time as few as possible. The purpose of predictive maintenance is to reduce the downtime of equipment and machines for technical reasons. Maintenance measures are to be carried out exactly when they are unavoidable – when a shutdown is imminent. The aim is to increase the time between two specific maintenance measures and reduce technically related downtime.
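
The trade-off of a fixed preventive interval can be made concrete with a small calculation. This is a sketch under stated assumptions: the operating hours and the safety factor of 0.9 are hypothetical, and the function names are invented for this example.

```python
from statistics import mean


def preventive_interval(hours_between_failures, safety_factor=0.9):
    """Fixed interval set a bit shorter than the average time between malfunctions."""
    return safety_factor * mean(hours_between_failures)


def evaluate_interval(hours_between_failures, interval):
    """Split past run lengths into failures the interval would have missed
    and service life that would have been given away by replacing early."""
    missed = [h for h in hours_between_failures if h < interval]
    unused = [h - interval for h in hours_between_failures if h >= interval]
    return missed, unused


# Hypothetical operating hours between malfunctions for one component type.
history = [700, 1000, 1300]
interval = preventive_interval(history)
missed, unused = evaluate_interval(history, interval)
print(f"interval: {interval:.0f} h, missed failures: {missed}, unused hours: {unused}")
```

With data like this, a fixed interval both misses the early failure and discards usable service life on the longer runs, which is why a prediction per machine is attractive.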

The advantages of using predictive maintenance are:

  • Insight into how machines are used and how they are subject to wear
  • Enhanced planning security
  • Maximized production time for machines
  • Development of new business models through the provision of machine data

Predictive maintenance uses sensor data (e.g. temperature, vibration, tribology, acoustics, imaging, etc.), event data from IT systems, process parameters from machine controllers, and machine learning to predict the probability of impending shutdowns. The Industrial Internet of Things (IIoT) allows process parameters to be captured from equipment and machines. Large volumes of data are frequently stored and processed in cloud environments. Recorded process parameters, the associated operating states of the equipment and machines, as well as the technical reasons for shutdowns are used as learning examples. Machine learning is then used to learn patterns in the process parameters, the associated operating states, and the reasons for shutdown. Next, the process parameters recorded by sensors are continuously monitored in production. If a learned pattern is detected in the process parameters, maintenance is informed of the grounds for an imminent shutdown and the predicted time until it occurs. Variances in the process parameters that cannot be attributed to a familiar pattern are used to inform maintenance of an anomaly. Then, maintenance can provide information in order to learn new operating states and reasons for shutdown as well as the associated patterns.
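
The anomaly notification described above can be sketched in a strongly simplified form. Here a plain z-score check stands in for a learned model; this is an illustrative substitute, not the method used in any particular project, and the limit of three standard deviations is an assumption.

```python
from statistics import mean, stdev


def is_anomalous(history, value, z_limit=3.0):
    """Flag a reading that deviates strongly from previously observed values.

    history: earlier readings of the same process parameter (at least two)
    value:   the current reading
    z_limit: hypothetical cut-off in standard deviations
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All earlier readings were identical, so any deviation is unusual.
        return value != mu
    return abs(value - mu) / sigma > z_limit


# Hypothetical spindle temperatures in degrees Celsius.
normal_readings = [50.0, 50.2, 49.8, 50.1, 49.9]
print(is_anomalous(normal_readings, 50.1))
print(is_anomalous(normal_readings, 52.0))
```

A reading flagged in this way would be reported to maintenance, whose feedback can then serve as a new labeled example.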

When predictive maintenance is used, a distinction can generally be made between two application scenarios:

  • Predictive maintenance can be used in your own production to avoid unplanned, technically related shutdowns.
  • Predictive maintenance can be offered as a manufacturer service for customers that purchase the manufactured machines and use them in their own company. In this case, an unplanned, technically related shutdown affects the customer’s production and, if malfunctions are frequent, could damage the manufacturer’s reputation. As a manufacturer service, the process parameters of the equipment and machines can be displayed in a service portal to point to imminent shutdowns. An error message can include information on the cause and on the maintenance measure, as well as any spare parts required and how to order them directly. The manufacturer service gives every customer an advantage because the data is collected and evaluated centrally by the manufacturer. This way, once a technical fault has occurred for the first time on one machine, the associated patterns in the sensor data can be recognized on other customers’ machines as well.

Predictive maintenance addresses the fundamental problem of finding the right moment to take action: an optimization problem aimed at greater cost efficiency. This method can also be applied to similar issues. For example, waste containers are normally emptied in a fixed rhythm of 2 to 4 weeks. In some cases they are emptied too early and the container is not very full. In other cases, additional waste bags are already lying next to the container on the street. With the right data (e.g. weight of the container when it is being emptied), container sizes and routes of the disposal company could be tailored to the type of customer and planned in advance. This can increase efficiency and conserve resources, saving costs for disposal companies and customers. This approach is currently being applied to public waste containers as part of smart city initiatives.

How Predictive Maintenance Works

Monitoring the condition of the equipment and machines is essential for predictive maintenance. The convergence of operational technology (OT) and information technology (IT), as well as the availability of bandwidth, computing capacity, and memory, make it possible to collect, store, and analyze large volumes of data. To this end, different data is collected by the appropriate sensors and read out from machine controllers and other IT systems. The following data, among other information, is used to evaluate the condition of the equipment and machines:

  • Vibrations (e.g. deflection, speed, acceleration, or ultrasound)
  • Temperature (e.g. component temperature, ambient temperature, infrared radiation)
  • Tribology (e.g. wear particles)
  • Event data (e.g. state of production, error messages)
  • Process parameters (rotational speed, processing time)

IIoT platforms such as AWS IoT, Microsoft Azure IoT, or Siemens MindSphere collect the data from the sensors and transmit it to a cloud environment for storage and analysis. Access to the sensors is frequently provided by OPC UA (Open Platform Communications Unified Architecture), a data exchange standard for industrial communication, or by special IIoT connectors. A predictive maintenance environment, as in this example of a machine tool, can be structured as follows:

Exemplary Predictive Maintenance Architecture for a Machine Tool.
Source: Novatec internal

The architecture shown in the illustration gives an overview of the components and technologies used in the predictive maintenance of a machine tool. The architecture and technologies follow a cloud-native, open-source approach. This allows for operation in your own data center or with AWS, Azure, and other cloud providers. Technologies can also be replaced by managed services from AWS and Azure. For example, Amazon SageMaker provides the components used for machine learning in a single tool set. The concrete implementation of the architecture always depends on the circumstances and requirements of the respective project.

Production (shop floor)

The machine tool is located in production. Using the machine tool for production causes wear on its components and tools. A worn cutter can result in more energy being required for the forward feed function, higher temperatures, and the work step taking more time. A worn-out spindle guide becomes noticeable through noise and vibrations. The consequences include greater wear of other components and a reduction in the number of workpieces produced. Increased temperatures during milling can also be caused by insufficient coolant. Modern machine tools are equipped with a number of sensors that supply measured values such as temperature, pressure, vibrations, and filling levels as data. In addition, further data, such as the rotational speed of a spindle or the energy consumption of electric motors, can be obtained from the equipment and machine control units. Sensor data is collected by an IIoT platform via OPC UA and transmitted to a database in a cloud environment. Depending on the circumstances, sensor data can be transmitted to one or more customers. For security reasons, data transmission mostly follows the push principle and is initiated from production.

The Cloud

All data transmitted from the various machine tools is stored centrally in a database. This allows for a comprehensive analysis beyond individual machine tools for entire series. In addition, process data from an ERP or PPS system is stored in the database. This includes data such as the utilization or processing time of a workpiece. The data store is selected based on the requirements of the IoT strategy for processing large volumes of data (e.g. scalability, speed, resource allocation, in-memory). The data is collected over a set time period in preparation for the training phase. During this time, visualizations, for example in the form of dashboards, already provide an initial benefit from the data.


Once there is enough data in the database to train a machine learning model, development of the model begins. From the training data, the model learns to predict the remaining utilization time until a technically related shutdown of the machine tool occurs. To this end, the data in the database first undergoes preprocessing, which, among other things, extracts the relevant sensor data and normalizes or standardizes it. The data is then split into three sets:

  • Training data consists of examples from which a machine learning model learns patterns in the data and derives regularities.
  • Validation data is used to perform a fine adjustment of the parameters of the model for an optimal prediction of the remaining utilization time.
  • Test data is used to evaluate the quality of the model’s predictions.
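
The preprocessing and the split into training, validation, and test data can be sketched as follows. This is a minimal illustration under stated assumptions: the split ratios of 70/15/15 are hypothetical, the split is chronological because sensor data is ordered in time, and the standardization parameters are derived from the training data only so that no information from the later sets leaks into training.

```python
from statistics import mean, pstdev


def chronological_split(rows, train=0.7, validation=0.15):
    """Split time-ordered rows into training, validation, and test sets."""
    n = len(rows)
    train_end = round(n * train)
    validation_end = train_end + round(n * validation)
    return rows[:train_end], rows[train_end:validation_end], rows[validation_end:]


def fit_standardizer(values):
    """Return a function that maps x to (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # avoid division by zero for constant signals
    return lambda x: (x - mu) / sigma


# Hypothetical time-ordered temperature readings.
readings = [40.0 + 0.1 * i for i in range(100)]
train_rows, validation_rows, test_rows = chronological_split(readings)

standardize = fit_standardizer(train_rows)  # fitted on training data only
train_scaled = [standardize(x) for x in train_rows]
validation_scaled = [standardize(x) for x in validation_rows]
test_scaled = [standardize(x) for x in test_rows]
print(len(train_rows), len(validation_rows), len(test_rows))
```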

A machine learning model is trained to predict the remaining utilization time based on the training data. The trained model is evaluated based on the test and validation data, thus ensuring its quality. Then the model is deployed for use in production with current machine tool data. The training process is a cycle that is repeated over time whenever circumstances change: This could be new equipment, new sensors, or new data. For this reason, a high degree of automation is important. An accurate prediction can be made only if the machine learning model is familiar with all the patterns in the data and is always up to date.
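
To train a model on the remaining utilization time, each recorded sample needs a target value. One plausible way to derive these targets, sketched here as an assumption rather than a description of a specific project, is to take the time from each sample to the next recorded shutdown. The function name and the data are hypothetical.

```python
from bisect import bisect_left


def remaining_time_labels(sample_times, shutdown_times):
    """For each sample time, return the time until the next recorded shutdown.

    Samples taken after the last recorded shutdown get None, because their
    true remaining time is not yet known and they cannot be used as examples.
    """
    shutdowns = sorted(shutdown_times)
    labels = []
    for t in sample_times:
        i = bisect_left(shutdowns, t)  # index of first shutdown at or after t
        labels.append(shutdowns[i] - t if i < len(shutdowns) else None)
    return labels


# Hypothetical timestamps in operating hours.
print(remaining_time_labels([0, 5, 10, 12, 25], [10, 20]))
```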


The model deployed from the training process allows a prediction to be made of the remaining utilization time of the machine tool until a maintenance measure is required. Current machine tool data also undergoes preprocessing when the model is in operation. The preprocessing of current data ensures that the ML model can process it correctly. With the current data and the patterns learned by training, the machine learning model makes an accurate prediction of when the machine tool will malfunction. The prediction is saved back in the database as an output. This way, all relevant sensor data, process data, and the prediction can be visualized and monitored in a dashboard such as Grafana. Customer service portals can consume the data and integrate and display it in the portal. Rules for notifications can be created, and persons can be alerted if a malfunction is imminent and no measures have been planned yet. In the best case scenario, the planning of maintenance measures is automated as an optimization problem.
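
The notification rule mentioned above, alerting when a malfunction is imminent and no measure has been planned, can be sketched as follows. The field names and the warning horizon of 48 hours are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MachineStatus:
    machine_id: str
    predicted_hours_to_shutdown: float  # output of the prediction model
    maintenance_planned: bool           # e.g. taken from the planning system


def machines_to_alert(statuses, warning_horizon_hours=48.0):
    """Return the IDs of machines that need attention.

    A machine is reported if the predicted shutdown lies within the warning
    horizon and no maintenance measure has been planned for it yet.
    """
    return [
        s.machine_id
        for s in statuses
        if s.predicted_hours_to_shutdown <= warning_horizon_hours
        and not s.maintenance_planned
    ]


# Hypothetical fleet status.
fleet = [
    MachineStatus("mill-01", 12.0, maintenance_planned=False),
    MachineStatus("mill-02", 30.0, maintenance_planned=True),
    MachineStatus("mill-03", 400.0, maintenance_planned=False),
]
print(machines_to_alert(fleet))
```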

The technical implementation of the training and prediction pipelines consists of decoupled, scalable, and interchangeable microservices that can be implemented, for example, with Docker containers regardless of the platform. This allows individual services to be written in different programming languages, because uniform communication via HTTP-based interfaces such as REST ensures interoperability. The microservices are integrated into an existing or new cluster. A machine learning framework such as TensorFlow or Keras is used to build and train deep neural networks for the task at hand, evaluate the results, and deploy models. A tool such as DVC (Data Version Control) is used for the versioning of data and models, as well as for pipelining, to enable reproducibility, maintainability, and traceability for the training cycle.
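
A prediction microservice that communicates over HTTP could look roughly like the sketch below. It uses only the Python standard library; the endpoint behavior, the feature name `vibration_mm_s`, and the placeholder formula are assumptions for illustration and do not represent a trained model.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict_remaining_hours(features: dict) -> float:
    """Placeholder for the trained model; a real service would load it from storage."""
    # Hypothetical rule of thumb, NOT a trained model: more vibration, less time.
    return max(0.0, 500.0 - 100.0 * float(features.get("vibration_mm_s", 0.0)))


def handle_prediction_request(body: bytes):
    """Pure request logic, kept apart from the HTTP plumbing so it is easy to test."""
    try:
        features = json.loads(body)
        if not isinstance(features, dict):
            raise ValueError("expected a JSON object")
        return 200, {"remaining_hours": predict_remaining_hours(features)}
    except (ValueError, TypeError) as exc:
        return 400, {"error": str(exc)}


class PredictionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_prediction_request(self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8080) -> None:
    """Start the service; intended to be called from the container entry point."""
    HTTPServer(("0.0.0.0", port), PredictionHandler).serve_forever()
```

Because the service only exposes JSON over HTTP, the preprocessing or dashboard components that call it can be written in any language and replaced independently.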

Our Predictive Maintenance Service

If data is already collected by an IIoT platform, we usually proceed in four steps for a predictive maintenance project:

As a first step, a joint workshop is held to record the actual status. It is important for us to understand whether you use predictive maintenance in your own production or whether you want to offer it to the customers of the machines you produce. We would like to understand which data and sources exist, how large the volume of data produced daily is, and how often the data for process parameters is updated. The most important thing is to clarify whether any data with usable examples currently exists. This information allows us to make an initial estimate of the quality of the data and of how frequently a new version of the model must be trained. We are also interested in knowing which existing (IIoT) platform is used, how the architecture is structured, which peripheral systems (e.g. ERP systems, BPM systems) are present, and which systems must be integrated. A target architecture will then be defined together with the technology strategy to be followed, for example an open-source-first or a managed-cloud-service-first strategy. Based on these principles, we will develop the objectives with you.

As a second step, we evaluate the quality of the data by means of an exploratory data analysis with regard to suitability and ability to predict the condition of the machine. We use a representative data set provided by you for this purpose. This is followed by a prototypical implementation of one or more machine learning (ML) models under consideration as well as their evaluation and documentation as part of a feasibility study (proof of value). This will produce a qualified decision recommendation including opportunities and risks. In addition, findings and insights from the existing data can be revealed, and relationships and correlations can be visualized.

In the third step, the predictive maintenance solution is implemented based on the architecture developed and the objectives. The solution is implemented depending on whether the company uses a cloud native application or managed services of a cloud provider (e.g. AWS or Azure). A holistic machine learning solution that comprises both a training phase and an inference phase is created. This includes a machine learning pipeline for the lifecycle of the machine learning model and the associated versioning of data and models. We attach great importance to automation and scaling in our predictive maintenance solution. In order to guarantee the availability of the production system and ensure a smooth operation, the services are monitored – in terms of the operational aspects (e.g. response time, number of calls, use of memory, or CPU utilization) as well as the qualitative aspects of the prediction (e.g. accuracy). From this information a conclusion can be made as to when a new model should be trained and how the application should scale automatically with the load.
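
Monitoring the quality of the predictions in order to decide when a new model should be trained can be sketched as follows. The mean absolute error as the quality metric and the tolerance factor of 1.5 are assumptions chosen for illustration.

```python
from statistics import mean


def mean_absolute_error(predicted, actual):
    """Average absolute deviation between predicted and observed remaining hours."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def should_retrain(predicted, actual, baseline_mae, tolerance=1.5):
    """Signal retraining when the recent error clearly exceeds the error
    measured on the test data at deployment time (baseline_mae)."""
    return mean_absolute_error(predicted, actual) > tolerance * baseline_mae


# Hypothetical recent predictions and the remaining hours actually observed.
recent_predicted = [100.0, 80.0, 60.0]
recent_actual = [90.0, 85.0, 40.0]
print(should_retrain(recent_predicted, recent_actual, baseline_mae=5.0))
```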

In the fourth step, the enablement of the DevOps engineers, the specialist areas, and the data scientists takes place. We familiarize your employees with the technologies and methods in workshops or training sessions. Here it doesn’t matter whether it is just a technical enablement or if we are teaching your employees the basics of machine learning. What is important to us is that you receive the greatest added value possible for yourself and your customers from the predictive maintenance solution.

If connectivity has not yet been established and an IIoT platform is not yet available, we will gladly take care of that and expand the scope of the project accordingly to include additional steps. Have a look at our IoT program.

Your direct contact

Dr. Harald Bosch

Senior Consultant