Process Mining Software

We have learned about the different types of process mining and the kind of data we need to get started. When loading an event log in XES format into one of the leading process mining tools, one can immediately discover processes and check conformance. It is also possible to load event data in the form of a comma-separated values (CSV) file, an Excel file, or a database table. However, when loading, for example, a CSV file, one first needs to configure the import function of the tool, typically by indicating which columns hold the case identifier, the activity name, and the timestamp. As mentioned before, the biggest initial hurdle is to extract the data from information systems like SAP. After this, one can choose one of the over 35 commercial process mining tools or use one of the open-source tools.
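To make the import step concrete, the following is a minimal sketch of what such a configuration amounts to, using only the Python standard library. The column names (`order_id`, `step`, `completed_at`), the sample rows, and the helper `load_event_log` are hypothetical and chosen for illustration; they do not come from any particular tool or information system.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical CSV export; a real export will use its own column headers.
SAMPLE_CSV = """order_id,step,completed_at
1001,Create order,2024-01-05 09:00:00
1002,Create order,2024-01-05 09:30:00
1001,Approve order,2024-01-05 10:15:00
1001,Ship goods,2024-01-06 08:00:00
1002,Cancel order,2024-01-05 11:00:00
"""


def load_event_log(text, case_col, activity_col, time_col, time_format):
    """Group CSV rows into one time-ordered trace per case.

    The three column arguments play the role of the import configuration:
    they say which column identifies the case, the activity, and the time.
    """
    traces = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        timestamp = datetime.strptime(row[time_col], time_format)
        traces[row[case_col]].append((timestamp, row[activity_col]))
    # Order the events of each case by timestamp.
    return {case: sorted(events) for case, events in traces.items()}


log = load_event_log(
    SAMPLE_CSV,
    case_col="order_id",
    activity_col="step",
    time_col="completed_at",
    time_format="%Y-%m-%d %H:%M:%S",
)
for case, events in log.items():
    print(case, [activity for _, activity in events])
```

Choosing a different case column (for example, a customer number instead of an order number) would yield a different view on the same data, which is why tools ask for this mapping explicitly.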

When I started to work on process mining in the late 1990s, we were developing a separate tool for each process mining algorithm. For example, I developed the first implementation of the Alpha algorithm in ExSpect, an executable specification language, around the turn of the century. We soon realized that it would be better to have a common open-source framework for developing process mining techniques that share common functionalities like loading an event log and showing a discovered process model. This led to the development of the process mining framework ProM. In 2004, the first fully functional version of the ProM framework was released. This version contained 29 plug-ins. Today, ProM provides over 1500 plug-ins supporting all four types of process mining. For example, there are dozens of process discovery algorithms implemented in ProM. ProM is still the de facto standard in the academic world. However, many other open-source and commercial tools have been developed, inspired by ideas first implemented in ProM. Take, for example, Celonis, which implemented the directly-follows graph, conformance checking using token-based replay, token animation based on event data, inductive process discovery, and predictive analytics, all first developed in ProM.

When considering open-source tools, there are several alternatives to the Java-based ProM framework, each building on a different platform. RapidProM is based on the RapidMiner infrastructure. PM4KNIME builds on the KNIME platform. bupaR is an open-source, integrated suite of R packages. PM4Py is the leading open-source process mining platform written in Python.

Next to these open-source process mining tools, there are over 35 commercial products. Examples are Celonis, Disco, UiPath Process Mining, Apromore, QPR, Myinvenio, PAFnow, Everflow, Mehrwerk process mining, Minit, ABBYY Timeline, Lana Labs, Signavio Process Intelligence, Skan, Process Diamond, and ARIS Process Mining. The current market leader is Celonis, which offers process mining software covering the whole process lifecycle. Next to the four types of process mining described before, Celonis can be used to create easy-to-use dashboards and automatically trigger improvement workflows to address conformance and performance problems. Most of the tools have unique capabilities and target different groups of users. Some tools are deliberately lightweight and only create directly-follows graphs annotated with times and frequencies. Tools like Signavio and ARIS combine a business process modeling environment with process mining, and therefore focus more on conformance checking than on discovery. UiPath and Skan focus on process mining as a supporting tool for Robotic Process Automation (RPA). Apromore, Everflow, and Lana provide more advanced process discovery and conformance checking techniques. Among the commercial tools, we also see solutions that build upon existing platforms. For example, PAFnow uses Microsoft Power BI, and Mehrwerk process mining builds upon the Qlik infrastructure. These latter tools are particularly interesting for organizations already using such a platform.
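To illustrate what a directly-follows graph annotated with times and frequencies contains, here is a minimal sketch in plain Python. It is not the implementation used by any of the tools named above; the toy log, the activity names, and the function `directly_follows_graph` are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical toy log: case id -> time-ordered (timestamp, activity) events.
LOG = {
    "1001": [
        (datetime(2024, 1, 5, 9, 0), "Create order"),
        (datetime(2024, 1, 5, 10, 0), "Approve order"),
        (datetime(2024, 1, 6, 10, 0), "Ship goods"),
    ],
    "1002": [
        (datetime(2024, 1, 5, 9, 30), "Create order"),
        (datetime(2024, 1, 5, 12, 30), "Approve order"),
        (datetime(2024, 1, 6, 12, 30), "Ship goods"),
    ],
    "1003": [
        (datetime(2024, 1, 5, 9, 0), "Create order"),
        (datetime(2024, 1, 5, 9, 30), "Cancel order"),
    ],
}


def directly_follows_graph(log):
    """Return {(a, b): (frequency, mean_hours)} for each directly-follows pair.

    Activity b directly follows activity a if, within one case, b is the
    next event after a.
    """
    counts = defaultdict(int)
    total_seconds = defaultdict(float)
    for events in log.values():
        # Pair each event with its immediate successor in the same case.
        for (t1, a), (t2, b) in zip(events, events[1:]):
            counts[(a, b)] += 1
            total_seconds[(a, b)] += (t2 - t1).total_seconds()
    return {
        edge: (n, total_seconds[edge] / n / 3600)
        for edge, n in counts.items()
    }


dfg = directly_follows_graph(LOG)
for (a, b), (frequency, mean_hours) in sorted(dfg.items()):
    print(f"{a} -> {b}: {frequency}x, mean {mean_hours:.1f} h")
```

Each edge carries a frequency and an average waiting time, which is the information such lightweight tools typically draw on the arcs of the graph.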

The over 35 commercial tools have very different price tags, whereas the open-source tools are free. The price of the software may also depend on the number of processes or the number of events. There are also differences in scalability. Therefore, it is good to build up expertise first and experiment with some of the tools before selecting one.

Next to tools and access to data, one needs to embed process mining in the organization. The next lesson discusses how to approach a process mining project using the L* lifecycle model.

Written by

Wil van der Aalst