Efficient alignment-based conformance checking of procedural and declarative process models

Jacobo Casas Ramos

Ayuda

Efficient alignment-based conformance checking of procedural and declarative process models

Autores: Jacobo Casas Ramos
Directores de la Tesis: Manuel Mucientes Molina (dir. tes.) , Manuel Lama Penín (dir. tes.)
Lectura: En la Universidade de Santiago de Compostela ( España ) en 2025
Idioma: inglés
Tribunal Calificador de la Tesis: Fabio Patrizi (presid.) , Juan Carlos Vidal Aguiar (secret.) , Pedro Javier Alvarez Pérez Aradros (voc.)
Enlaces
- Tesis en acceso abierto en: MINERVA
Resumen
- The field of Process Mining has gained significant attention in recent years due to its ability to provide valuable insights into business processes. One of the main challenges in Process Mining is conformance checking, which focuses on comparing the behavior observed in event logs with a process model in order to identify and assess deviations and inconsistencies. Among the most widely used and informative techniques for this task is the computation of optimal alignments, as it allows relating the actual behavior recorded in event logs with the expected behavior defined by process models, even when no perfect match exists. Conformance checking is crucial for improving process efficiency and ensuring regulatory compliance.
  
  In conformance checking, an alignment is a sequence of moves that reconciles the events of a trace extracted from the event log with a valid execution of the process model. The goal is to find the optimal alignment, i.e., the sequence of moves with the lowest total cost that best explains how the observed behavior conforms to (or deviates from) the model. The search process for obtaining optimal alignments involves exploring a state space, where each state represents a partial alignment. For this task, algorithms based on A* search are commonly used, as they can find optimal solutions by guiding exploration through heuristic functions. However, the state of the art in conformance checking often faces difficulties in efficiently computing optimal alignments due to the complexity of the search space and the limitations of the algorithms employed.
  
  The hypothesis guiding this research is that exploring various optimization techniques, including advanced A* heuristics, novel search-space pruning strategies, and problem reformulations, can significantly improve the efficiency of computing optimal alignments. This research proposes new techniques to efficiently align event logs with procedural, declarative, and data-aware declarative process models. The main contributions include three specialized algorithms for each of these modeling paradigms: REACH, DeclareAligner, and DADA.
  
  REACH leverages A* search and optimization techniques to enhance performance in computing optimal alignments between event logs and procedural models. REACH introduces a novel heuristic that detects required activities to accelerate the search. In addition, it applies pruning techniques to remove unnecessary states from the search space, such as State Reduction forcing asynchronous Model moves (SRModel) and State Reduction forcing asynchronous Log moves (SRLog), thereby reducing the computational complexity associated with process model execution. The algorithm also employs a Partial Reachability Graph (PRG) to accelerate process model execution.
  
  DeclareAligner extends A* search to address the computation of optimal alignments for declarative process models, leveraging the flexibility of these models and introducing several optimization strategies. The main optimizations include: focusing on actions that directly repair violated constraints; proposing a heuristic that takes into account the suggested actions for all violated constraints in a state to provide accurate cost estimates; pruning states that cannot lead to an optimal alignment; preprocessing the problem before search starts; and performing multiple operations simultaneously to avoid intermediate states.
  
  DADA enables the computation of optimal alignments for declarative models with data conditions. To this end, it integrates two complementary strategies: an A* search mechanism to explore the space of possible alignments, and Satisfiability Modulo Theories (SMT) solving to handle both control-flow and data constraints. The main optimizations include: prioritizing repairs of violations that generate fewer successor states; detecting dead-end states by precomputing all violations and checking whether repairs are applicable; exploiting SMT assumptions to avoid redundant work; and eliminating inconsistent states through SMT checks.
  
  The evaluation of these algorithms demonstrates their efficiency, showing significant improvements in computation time and in the number of problems solved compared with the state of the art in conformance checking. This research contributes to the development of efficient algorithms for computing optimal alignments capable of handling large and complex event logs and process models across different paradigms, paving the way for a new generation of Process Mining techniques.