Master’s Project: Diving into Process Mining

From now on I will also use this blog to talk about my Master’s project. The project is required to complete my Master in Information Management and is worth 15 of the 60 ECTS for the whole Master. The Master’s thesis is then written on the basis of that project. I’ll try to update this blog as often as I can.

Process Mining

My project is about process mining. What exactly is process mining? Well, according to the process mining manifesto, it is:

 a relatively young research discipline that sits between computational intelligence and data mining on the one hand, and process modeling and analysis on the other hand.

And the idea behind it:

The idea of process mining is to discover, monitor and improve real processes (i.e., not assumed processes) by extracting knowledge from event logs readily available in today’s (information) systems.

So it’s about algorithms that can extract knowledge from event logs. Put more simply: process mining is a set of techniques used to discover processes, monitor existing processes and even improve them.

We’re more accustomed to the idea of data mining, which is about extracting knowledge from data warehouses. The focus was always on data until around the 1990s, when ideas like process re-engineering emerged and processes started to become important as well. Process mining fills the gap between Business Intelligence and Business Process Management.

The tool

Today I was playing around with ProM, the tool used for process mining. ProM gives you a framework in which you can use process mining tools. I also followed an introductory tutorial on using it.

ProM lets you perform various actions on an event log using plug-ins. These plug-ins fall into three categories:

  • Discovery: These plug-ins take only the event log as input. They answer questions like: How are the cases actually being executed? Are the rules being obeyed?
  • Conformance: These plug-ins check how well the data in the event log matches the prescribed behavior of deployed process models. They help you monitor processes.
  • Extension: These plug-ins take a model and the event log as input and discover information that will enhance the model.
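To make the conformance idea a bit more concrete, here is a minimal Python sketch. It is not how ProM does it: the "model" is just a set of allowed direct successions that I made up, and the activity names and the tiny log are hypothetical too. It only shows the general shape of "compare the log with the prescribed behavior".

```python
# Hypothetical "prescribed" model: which activity may directly follow which.
# Real conformance checking replays the log on a process model such as a
# Petri net; this succession check is a much cruder stand-in.
ALLOWED = {
    ("Register", "Analyze"),
    ("Analyze", "Repair"),
    ("Repair", "Test"),
    ("Test", "Archive"),
    ("Test", "Repair"),  # a failed test sends the phone back to repair
}


def trace_fits(trace, allowed=ALLOWED, start="Register", end="Archive"):
    """Return True if the trace starts and ends correctly and uses only allowed steps."""
    if not trace or trace[0] != start or trace[-1] != end:
        return False
    return all(step in allowed for step in zip(trace, trace[1:]))


def fitting_fraction(traces, allowed=ALLOWED):
    """Fraction of traces in the log that the model allows."""
    return sum(trace_fits(trace, allowed) for trace in traces) / len(traces)


# Made-up log: one list of activities per case.
log = [
    ["Register", "Analyze", "Repair", "Test", "Archive"],
    ["Register", "Analyze", "Repair", "Test", "Repair", "Test", "Archive"],
    ["Register", "Repair", "Test", "Archive"],     # skips the analysis
    ["Register", "Analyze", "Repair", "Archive"],  # skips the test
]

print(fitting_fraction(log))
```

Two of the four made-up cases follow the prescribed steps, so the fitting fraction is 0.5. The two deviating cases are exactly the ones you would want to look at when monitoring the process.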

A cool example of a plug-in that I used today is the alpha-algorithm, which mines a Petri net from the event log. I used the event log that comes with the tutorial; it describes a process for repairing telephones in a company.

In the image above you see the key data of the event log: 1 process and 1000 instances of that process, which together account for 10476 events, a mean of about 10 events per instance.
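These key figures are easy to compute yourself. Below is a small Python sketch on a made-up mini log (the case ids and activity names are hypothetical, not the tutorial data) that counts cases and events and derives the mean number of events per case, followed by the same division for the numbers ProM reported.

```python
from collections import Counter

# Made-up mini log: one (case_id, activity) pair per event.
events = [
    (1, "Register"), (1, "Analyze"), (1, "Repair"), (1, "Test"), (1, "Archive"),
    (2, "Register"), (2, "Analyze"), (2, "Repair"), (2, "Test"),
    (2, "Repair"), (2, "Test"), (2, "Archive"),
]


def summarize(events):
    """Count cases and events and compute the mean number of events per case."""
    events_per_case = Counter(case_id for case_id, _ in events)
    n_cases = len(events_per_case)
    n_events = len(events)
    return {
        "cases": n_cases,
        "events": n_events,
        "mean_events_per_case": n_events / n_cases,
    }


print(summarize(events))

# The same division for the figures reported by ProM:
print(10476 / 1000)
```

For the mini log this gives 2 cases, 12 events and a mean of 6.0. For the tutorial log, 10476 events over 1000 instances is 10.476 events per instance, which ProM rounds to 10.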

In the image above you see the action tab of ProM. The input is the event log I used; under Actions I searched for the alpha-algorithm, which was installed as part of a more general plug-in.

Above you see the result of the alpha-algorithm. It actually builds a process model out of the event log, very cool. In more detail: the alpha-algorithm mines the control-flow perspective of a process. From the tutorial:

The control-flow perspective of a process establishes the dependencies among its tasks. Which tasks precede which other ones? Are there concurrent tasks? Are there loops? In short, what is the process model that summarizes the flow followed by most/all cases in the log? This information is important because it gives you feedback about how cases are actually being executed in the organization.

With this model you can check whether the process in your company is actually being carried out the way you thought it was, or whether the model shows that things are done differently. This plug-in therefore falls under the first category, “Discovery”: it answers the question of how the cases are actually being executed.
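To get a feeling for how "which tasks precede which other ones" can be read off a log, here is a small Python sketch of the ordering relations that the alpha-algorithm starts from: directly-follows, causality (->), parallelism (||) and no relation (#). This is only that first step, not the construction of the Petri net and not ProM's implementation, and the two traces are made up for illustration.

```python
from itertools import product


def directly_follows(traces):
    """Collect all pairs (a, b) where b directly follows a in some trace."""
    pairs = set()
    for trace in traces:
        pairs.update(zip(trace, trace[1:]))
    return pairs


def footprint(traces):
    """Classify every pair of activities as '->', '<-', '||' or '#'."""
    follows = directly_follows(traces)
    activities = sorted({activity for trace in traces for activity in trace})
    relations = {}
    for a, b in product(activities, repeat=2):
        if (a, b) in follows and (b, a) in follows:
            relations[a, b] = "||"  # seen in both orders: treated as concurrent
        elif (a, b) in follows:
            relations[a, b] = "->"  # a is directly followed by b, never the reverse
        elif (b, a) in follows:
            relations[a, b] = "<-"
        else:
            relations[a, b] = "#"   # never directly follow each other
    return relations


# Made-up traces: repairing and informing the user happen in either order.
traces = [
    ["Register", "Analyze", "Repair", "Inform", "Archive"],
    ["Register", "Analyze", "Inform", "Repair", "Archive"],
]

relations = footprint(traces)
for pair in [("Register", "Analyze"), ("Repair", "Inform"), ("Register", "Archive")]:
    print(pair, relations[pair])
```

In this toy log, Register is always directly followed by Analyze (->), Repair and Inform show up in both orders (||), and Register and Archive never directly follow each other (#). From relations like these the alpha-algorithm derives the places and arcs of the Petri net.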