Process Oriented Data Science

Course title

Process Oriented Data Science

Faculty

Josep Carmona, jcarmona@cs.upc.edu. Associate Professor (Professor Titular) in the Department of Computer Science (Departament de Ciències de la Computació), Universitat Politècnica de Catalunya. Member of the ALBCOM research group.

Course language

English

Course schedule

July 13 - 17 from 4:00pm to 7:00pm

Description

Nowadays organizations need to adapt to the big data scenario. This goes beyond data storage and analysis and extends to process analysis as well. Process mining bridges the gap between traditional model-based process analysis and data-centric analysis techniques such as machine learning and data mining. Process mining techniques compare event data, which represents the observed behavior, against process models. Such process models can be created manually or discovered automatically via process mining techniques, and are crucial for analyzing the operational dimension of an organization.
Example applications of process mining include: analysis of treatments in a hospital, improving customer service processes in a multinational, analyzing failures of a baggage handling system, and many more.


The course explains the key analysis techniques of process mining, ranging from process model discovery to conformance analysis and process enhancement. Besides learning the important techniques, students will also learn modern tools for putting process mining into practice.
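
To give a flavor of discovery and conformance analysis, the following minimal Python sketch counts "directly-follows" relations in a toy event log and flags traces that deviate from the frequent behavior. The event log, the function names and the frequency threshold of 2 are hypothetical choices made for illustration only; this is not one of the algorithms or tools covered in the course.

```python
from collections import Counter

# Hypothetical toy event log: each trace is the ordered list of
# activities recorded for one case (e.g., one patient).
event_log = [
    ["register", "triage", "treat", "discharge"],
    ["register", "triage", "treat", "discharge"],
    ["register", "treat", "discharge"],
]


def discover_dfg(log):
    """Count how often activity a is directly followed by activity b."""
    dfg = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


def deviations(trace, allowed):
    """Return the directly-follows steps of a trace not allowed by the model."""
    return [(a, b) for a, b in zip(trace, trace[1:]) if (a, b) not in allowed]


# Discovery: build the directly-follows counts from the log.
dfg = discover_dfg(event_log)

# Assumption: relations observed at least twice form the "model".
model = {pair for pair, count in dfg.items() if count >= 2}

# Conformance: collect, per trace index, the steps that the model does not allow.
nonconforming = {}
for index, trace in enumerate(event_log):
    bad_steps = deviations(trace, model)
    if bad_steps:
        nonconforming[index] = bad_steps

print(dict(dfg))
print(nonconforming)
```

In this toy log, only the third trace skips the "triage" step, so it is the only one reported as deviating. Real process mining tools work with richer models and event attributes (timestamps, resources), but the underlying idea of comparing observed behavior against a model is the same.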


The evolution of information systems in the last decade has led to an exponential growth of digital process-related data, and hence these types of systems are one of the main drivers behind the rise of data science. Processes are the main reason information systems exist, e.g., “the sale process”, “the hiring process”, “the treatment process”, etc. Process models are a key element in the development and evaluation of any information system, since a model can be used not only in the specification stage, but also for reasoning about the system itself at any stage in an introspective manner, e.g., to detect problems or deviations, or to perform quantitative or qualitative analysis. Correctness is an important factor, since an analysis done on the basis of an informal model may lead to wrong conclusions. It is therefore crucial to rely on formal models for which mathematical techniques can be used to derive accurate information.


The analysis of event data, which records the footprints of process executions, enables organizations and individuals to extract real value from process-related data. This course introduces the main techniques to accomplish this ambitious goal. Students will acquire the intuition behind the techniques and will learn how to apply them to real-life data using the available tool support. No previous knowledge is required.

Program

D1: PROCESS MINING: DISCOVERY, CONFORMANCE AND EXTENSION (3h)
  • Motivation
  • Basics of process model formalisms
  • Basics of data and process mining, with some examples
D2: ALGORITHMS FOR PROCESS DISCOVERY (3h)
  • A selection of algorithms
  • Decompositional methods for discovery in big data
  • Successful real-life applications
  • Tool support (ProM, PMLAB, Disco)
D3: ALGORITHMS FOR CONFORMANCE CHECKING (3h)
  • Metrics for conformance checking
  • Algorithms
  • Decompositional methods for conformance in big data
  • Successful real-life applications
  • Tool support (ProM)
D4: PROCESS EXTENSION AND OPERATIONAL SUPPORT (3h)
  • Performance
  • Organizational mining
  • Decision mining
  • On-line techniques
  • Process prediction
  • Tool support (ProM and Disco)
D5: PROJECT AND APPLICATIONS (3h)
  • Healthcare
  • Municipalities
  • Software

Evaluation

On the last day, students will present a project on the application of process mining to real data using the available tools.

Classroom

PC3