Note: This repository contains the official code implementation for the paper "EHHN: An Event-driven Heterogeneous Hypergraph Network for Object-Centric Next Activity Prediction".
This project proposes an event-driven Heterogeneous Hypergraph Learning approach specifically designed for Object-Centric Event Logs (OCEL) to tackle the Next-Activity Prediction task in process mining.
The core code and datasets are organized as follows:
- data/: Directory for storing raw datasets
- newTry/: Core implementation code
  - Data Analysis & Preprocessing
    - ana.py: Script for exploring/analyzing raw datasets
    - construct_PE.py: Helper functions for feature construction
    - preprocess.py: Helper functions for general preprocessing
    - pipeline_OTC.py: Dedicated pipeline for the OTC dataset
    - pipline_2017.py: Dedicated pipeline for the BPI 2017 dataset
    - pipline_inter.py: Dedicated pipeline for the Inter dataset
    - pipline_p2p.py: Dedicated pipeline for the P2P dataset
  - Model Architecture
    - OCELhg.py: Custom core structure (OCEL Heterogeneous Hypergraph)
    - encoder.py: Encoder module
    - model.py: Overall model architecture
  - Training & Evaluation
    - Trainer.py: Model training script
    - test.py: Model testing and evaluation script
  - Configuration & Utilities
    - config.py: Configuration file
    - utils.py: General utility functions
We recommend using Conda to create a virtual environment:
conda create -n ocel_hg python=3.8
conda activate ocel_hg

Install required packages:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install torch_geometric pandas numpy pm4py

Place your raw OCEL datasets into the data/ directory. Ensure the filenames match the paths specified in your preprocessing pipelines.
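Before preprocessing, it can help to confirm that the packages installed above are importable in the active environment. The following is a minimal sketch and is not part of the repository; the helper name `missing_packages` is hypothetical, and the check only looks for the modules on the import path without verifying versions or CUDA support.

```python
import importlib.util

# Import names of the packages installed with pip above.
REQUIRED = ["torch", "torch_geometric", "pandas", "numpy", "pm4py"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If any package is reported as missing, re-run the corresponding `pip install` command inside the `ocel_hg` environment.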
Before running any scripts, open newTry/config.py to configure your settings (Target Dataset, Hyperparameters, Paths, etc.).
We provide customized preprocessing pipelines for each dataset. Run the specific pipeline tailored to your dataset to clean data and construct the hypergraph.
For the OTC dataset:
python newTry/pipeline_OTC.py

For other datasets:
python newTry/pipline_2017.py
python newTry/pipline_inter.py
python newTry/pipline_p2p.py

(Tip: If you are introducing a new dataset, run newTry/ana.py first to analyze the raw data structure.)
Once the preprocessing is complete, you can start training the model:
python newTry/Trainer.py

After the model finishes training, run the test script to evaluate its performance:
python newTry/test.py
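The preprocessing, training, and evaluation commands above can also be chained from a single Python entry point. The sketch below is not part of the repository: `STEPS`, `build_commands`, and `run_all` are hypothetical names, and it assumes the scripts take no command-line arguments (as in the commands shown) and read their settings from newTry/config.py.

```python
import subprocess
import sys

# Script paths copied from the commands above; swap the first entry for the
# pipeline that matches your dataset.
STEPS = [
    "newTry/pipeline_OTC.py",  # preprocessing and hypergraph construction
    "newTry/Trainer.py",       # training
    "newTry/test.py",          # evaluation
]


def build_commands(steps, python=sys.executable):
    """Turn script paths into argv lists that use the current interpreter."""
    return [[python, step] for step in steps]


def run_all(steps):
    """Run each step in order; check=True stops at the first failing script."""
    for cmd in build_commands(steps):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_all(STEPS)` from the repository root would execute the three stages in sequence and raise `subprocess.CalledProcessError` as soon as one of them exits with a non-zero status.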
- OCELhg.py: Maps object-centric event logs into a heterogeneous hypergraph structure to capture complex many-to-many relationships.
- Customized Pipelines: Emphasize refined feature engineering tailored to different business contexts.
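To illustrate the kind of many-to-many structure that a hypergraph over an object-centric event log can capture, here is a minimal sketch. It is not the implementation in OCELhg.py: the toy log, the function name `build_hypergraph`, and the choice to treat each event as a hyperedge over typed object nodes are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy object-centric log (hypothetical data): each event touches several typed objects.
events = [
    {"id": "e1", "activity": "place order",
     "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
    {"id": "e2", "activity": "pick item",
     "objects": {"item": ["i1"]}},
    {"id": "e3", "activity": "ship package",
     "objects": {"order": ["o1"], "item": ["i1", "i2"], "package": ["p1"]}},
]


def build_hypergraph(events):
    """Treat each event as a hyperedge over typed object nodes.

    Returns (hyperedges, memberships): hyperedges maps an event id to the set
    of (object_type, object_id) nodes it joins; memberships maps each node to
    the ordered list of event ids it takes part in.
    """
    hyperedges = {}
    memberships = defaultdict(list)
    for event in events:
        nodes = {(otype, oid)
                 for otype, oids in event["objects"].items()
                 for oid in oids}
        hyperedges[event["id"]] = nodes
        for node in sorted(nodes):
            memberships[node].append(event["id"])
    return hyperedges, dict(memberships)


hyperedges, memberships = build_hypergraph(events)
print(len(hyperedges["e3"]))        # 4 objects joined by event e3
print(memberships[("item", "i1")])  # ['e1', 'e2', 'e3']
```

A single event can join objects of several types, and a single object can appear in many events, which is the relationship a pairwise graph edge cannot express directly.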