Process mining bridges the gap between traditional model-based process analysis (e.g., simulation and other business process management techniques) and data-centric analysis techniques such as machine learning and data mining. Process mining provides a new means to improve processes in a variety of application domains. The omnipresence of event data combined with process mining allows organizations to diagnose problems based on facts rather than fiction. Over 35 commercial process mining tools, and open-source tools like our ProM tool, support a wealth of techniques, and we will hear about these in later lessons.
Process mining starts with event data. Events correspond to things happening in a system or an organization, for example, placing an order, sending a tweet, making a CT scan, transferring money, or unloading a container. Each event refers to a case and an activity and has a timestamp. An event may have additional attributes such as costs, value, weight, customer, location, etc. However, the three mandatory attributes of an event are case, activity, and timestamp. Example cases are orders, applications, patients, and packages. When making process models, we also call these process instances. The steps in the process are executed with the goal to handle a particular case. The activity attribute of an event refers to a particular step in the process; for example, the activity "approve the application." The timestamp of an event marks when it occurred. We may have timestamps with millisecond precision, but sometimes we only have a date.
An event log is a collection of events corresponding to a process. Each case defines a trace: the sequence of events belonging to that case, ordered by timestamp. Now that we know about event data, we can define the four types of process mining.
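To make the notions of event, case, and trace concrete, here is a minimal sketch in Python. The event log, the case identifiers, and the helper name `build_traces` are made up for illustration; a real log would typically come from a file or database.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical toy event log: each event has the three mandatory attributes
# (case, activity, timestamp). The events arrive in arbitrary order.
events = [
    ("order-2", "place order", "2024-01-02T09:00:00"),
    ("order-1", "ship order", "2024-01-03T10:00:00"),
    ("order-1", "place order", "2024-01-01T08:30:00"),
    ("order-1", "approve order", "2024-01-02T14:00:00"),
    ("order-2", "cancel order", "2024-01-02T11:15:00"),
]


def build_traces(events):
    """Group events by case and order each group by timestamp."""
    by_case = defaultdict(list)
    for case, activity, timestamp in events:
        by_case[case].append((datetime.fromisoformat(timestamp), activity))
    return {
        case: [activity for _, activity in sorted(evs, key=lambda e: e[0])]
        for case, evs in by_case.items()
    }


traces = build_traces(events)
for case, trace in sorted(traces.items()):
    print(case, trace)
```

Each case yields exactly one trace, and the ordering inside a trace depends only on the timestamps, not on the order in which the events were recorded.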
The first type of process mining is process discovery. A discovery technique takes an event log and produces a model without using any a-priori information. The process model describes the process in terms of its activities. The process model can be a BPMN model, a Petri net, or a simple directly-follows graph. Such discovered process models have a visual representation showing the order in which activities can be executed. The model may show sequences of activities, concurrent activities, choices, skipping, and loops. People who have never seen this before are often very surprised that process mining is able to reconstruct the real processes based on event data. The results are often astonishing, and responses are comparable to people "seeing snow for the first time." The initial insights immediately trigger ideas for process improvement based on facts.
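The simplest of the model types just mentioned, the directly-follows graph, can be sketched in a few lines of Python. This is only an illustration on made-up traces, not how any particular tool implements discovery; the function name `directly_follows` is an assumption.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b."""
    dfg = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


# Hypothetical traces: each list is the activity sequence of one case.
traces = [
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "check", "check", "approve"],
]

dfg = directly_follows(traces)
for (a, b), count in sorted(dfg.items()):
    print(f"{a} -> {b}: {count}")
```

Even this small graph already shows a choice (approve or reject after check) and a loop (check followed by check). Note that a directly-follows graph cannot express concurrency explicitly; for that, richer models such as Petri nets or BPMN are needed.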
The second type of process mining is conformance checking. Here, an existing process model is compared with an event log of the same process. Conformance checking can be used to check if reality, as recorded in the log, conforms to the model and vice versa. For instance, there may be a process model indicating that purchase orders of more than one million Euro require two checks. Analysis of the event log will show whether this rule is followed or not. Another example is the checking of the so-called "four-eyes" principle, stating that particular activities should not be executed by one and the same person. By scanning the event log using a model specifying these requirements, one can discover potential cases of fraud and many other compliance problems. Hence, conformance checking can be used to detect, locate, and explain deviations, and to measure the severity of these deviations.
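As a rough illustration of the four-eyes check, the sketch below scans a made-up event log in which every event also carries a resource attribute (the person who executed the activity). The activity names, resource names, and the function `four_eyes_violations` are assumptions for the example; real conformance checking techniques compare the log against a full process model rather than a single rule.

```python
from collections import defaultdict

# Hypothetical events: (case, activity, resource).
events = [
    ("po-1", "first check", "alice"),
    ("po-1", "second check", "bob"),
    ("po-2", "first check", "carol"),
    ("po-2", "second check", "carol"),
    ("po-3", "first check", "dave"),
]


def four_eyes_violations(events, first, second):
    """Return the cases in which one person executed both activities."""
    who = defaultdict(lambda: defaultdict(set))
    for case, activity, resource in events:
        who[case][activity].add(resource)
    return sorted(
        case for case, acts in who.items() if acts[first] & acts[second]
    )


violations = four_eyes_violations(events, "first check", "second check")
print(violations)
```

This rule only flags cases where the same person did both checks. A case in which the second check is missing altogether, such as `po-3` here, is a different kind of deviation and would need its own rule or a model-based comparison.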
The third type of process mining is enhancement. Here, the idea is to extend or improve an existing process model using information about the actual process recorded in some event log. Whereas conformance checking measures the alignment between model and reality, this third type of process mining aims at changing or extending the a-priori model. One type of enhancement is repair, i.e., modifying the model to better reflect reality. Another type of enhancement is to extend the model with detailed information about bottlenecks.
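One simple way to picture the bottleneck extension is to annotate each directly-follows relation with the average time that passes between the two activities. The sketch below does this for made-up timestamped traces; the data and the function name `mean_waiting_hours` are assumptions, and the traces are assumed to be already ordered by timestamp.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical traces: case -> list of (activity, timestamp), in time order.
traces = {
    "c1": [
        ("register", "2024-01-01T09:00:00"),
        ("check", "2024-01-01T11:00:00"),
        ("approve", "2024-01-03T11:00:00"),
    ],
    "c2": [
        ("register", "2024-01-02T09:00:00"),
        ("check", "2024-01-02T10:00:00"),
        ("approve", "2024-01-03T10:00:00"),
    ],
}


def mean_waiting_hours(traces):
    """Average elapsed hours for each pair of directly following activities."""
    waits = defaultdict(list)
    for trace in traces.values():
        for (a, ts_a), (b, ts_b) in zip(trace, trace[1:]):
            delta = datetime.fromisoformat(ts_b) - datetime.fromisoformat(ts_a)
            waits[(a, b)].append(delta.total_seconds() / 3600)
    return {edge: sum(v) / len(v) for edge, v in waits.items()}


annotated = mean_waiting_hours(traces)
bottleneck = max(annotated, key=annotated.get)
print(annotated)
print("slowest step:", bottleneck)
```

The relation with the largest average is a candidate bottleneck. In practice one would also look at the spread of the waiting times and at how many cases pass through each relation, not only at the mean.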
The fourth type of process mining is operational support. These are forward-looking forms of process mining exploiting the process models created and extended using the first three types of process mining. An example is predicting the remaining processing time of a running case. Other examples are predicting deviations, predicting bottlenecks, and detecting unintended process changes. The insights generated using process mining need to be translated into actions, e.g., sending notifications, temporarily rerouting cases, changing the process design, and automatically starting workflows.
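To give a feel for predicting the remaining processing time, here is a deliberately naive baseline: for each activity, average how long completed cases still took after that activity, and use that average for a running case whose last observed activity is the same. The historical data and the function name `remaining_time_model` are assumptions; real predictive techniques use far more context than the last activity.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical completed cases: case -> list of (activity, timestamp),
# assumed to be ordered by timestamp.
history = {
    "c1": [
        ("register", "2024-01-01T09:00:00"),
        ("check", "2024-01-01T12:00:00"),
        ("approve", "2024-01-02T09:00:00"),
    ],
    "c2": [
        ("register", "2024-01-05T09:00:00"),
        ("check", "2024-01-05T10:00:00"),
        ("approve", "2024-01-05T19:00:00"),
    ],
}


def remaining_time_model(history):
    """Average hours from each activity to the end of its case."""
    samples = defaultdict(list)
    for trace in history.values():
        end = datetime.fromisoformat(trace[-1][1])
        for activity, timestamp in trace:
            remaining = end - datetime.fromisoformat(timestamp)
            samples[activity].append(remaining.total_seconds() / 3600)
    return {a: sum(v) / len(v) for a, v in samples.items()}


model = remaining_time_model(history)

# A running case that has just completed "check":
predicted_hours = model["check"]
print(f"predicted remaining time: {predicted_hours} hours")
```

A prediction like this only becomes operational support once it is tied to an action, for example sending a notification when the predicted remaining time exceeds a deadline.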
This brief explanation of the four types of process mining should give an idea of the broad scope of the field and show that process mining truly bridges the gap between traditional process management and mainstream data science techniques such as statistics, machine learning, and AI.
The next two lessons will focus on the first type of process mining: process discovery.