Process mining starts with extracting event data from any type of information system. Such event data can be transformed into multisets of traces. Each trace refers to a case and is described by a sequence of activities. In a hospital, it is possible to extract for each patient a trace that describes the different diagnostic and treatment activities. In a car factory, it is possible to derive a trace for each car produced describing the sequence of activities performed. Because it is impossible to describe a realistic event log in words, let us assume a simple artificial event log with 100 cases where 50 cases follow trace ABCE, 30 cases follow trace ACBE, and 20 cases follow ADE. Each case starts with activity A and ends with activity E. For 80 cases, both B and C occur in between these start and end activities. For 20 cases, activity D is performed instead of B and C. Such a process cannot be described by a simple direct-follows graph. One needs concurrency to describe the process properly. Alternatively, one can replicate activities, but this leads to more complex models. Higher-level notations such as Petri nets, BPMN diagrams, process trees, and UML Activity diagrams can describe such processes without any problems. Over the last 20 years, various discovery techniques have been developed able to discover such process models.

The α-algorithm was the first process discovery algorithm that could adequately deal with concurrency. However, the α-algorithm is not very practical: it has problems with noise, infrequent behavior, and complex routing constructs. Nevertheless, the algorithm served as the basis for many other approaches.

The α-algorithm transforms an event log into a Petri net able to reproduce the event log. The algorithm learns a so-called footprint matrix describing the relation between every pair of activities. Based on this, it tries to find a Petri net model that has the same footprint matrix. This is not always possible, but for a well-defined class of processes, it is guaranteed that the correct process model is returned. This algorithm is able to discover sequential behavior, choices, concurrency, and loops. However, it has, for example, difficulties with the skipping of activities. The α-algorithm is a member of a large collection of bottom-up discovery techniques. It is easy to explain these in terms of Petri nets. Petri nets were the first process model able to capture concurrency. A Petri net has two elements: transitions corresponding to activities and places corresponding to constraints. A Petri net without places and having a transition per activity is able to reproduce any trace found in the event log. Petri net places can be seen as constraints that limit the behavior of a process model. A place is constraining the behavior of the transitions connected to it, and process discovery can be reduced to the problem of finding such places. Bottom-up discovery techniques like the α-algorithm find suitable constraints for subsets of activities. Other examples of bottom-up approaches are techniques based on state-based regions and language-based regions.

Next to the bottom-up discovery techniques, there are also top-down approaches like the family of inductive mining techniques. These techniques recursively partition the event log into smaller event logs until each log contains events related to one activity. For example, if we can split the activities into two sets, X and Y such that activities in Y are never followed by activities in X, then we can split the data into a sublog with just X activities and a sublog with just Y activities. The process models learned for these two sublogs are connected using the sequence operator. If we can split the activities into two sets, X and Y such that traces have either just X activities or just Y activities, then we can split the data into two sublogs whose models are combined using a choice operator. The exact conditions and set of operators are not important. The key idea is that one recursively splits an event log into a smaller event log until the event log is trivial. Such a divide and conquer approach leads to hierarchical process models that can be expressed in terms of Petri nets, BPMN diagrams, process trees, UML Activity Diagrams, or statecharts. Inductive process mining techniques are supported by open-source tools such as ProM and PM4Py and commercial systems such as Celonis.

To date, many process discovery techniques have been proposed ranging from the classical α-algorithm to the inductive mining techniques. However, the problem of process discovery is not a solved problem, and people using process mining should be aware of the limitations of the different techniques. Filtered directly-follows graphs can be simple, but also very misleading if one does not understand how these are created. Therefore, it is good to read my book "Process Mining: Data Science in Action" or take one of the online courses. As Einstein said: "Everything should be made as simple as possible, but no simpler." This also applies to process discovery.

In the next lesson, we shift the focus from process discovery to process compliance. We will discuss the different conformance checking techniques that reveal differences between modeled behavior and the actual process represented by the event log.