Installation

Install the package
With pip:
pip install epftoolbox2
With uv:
uv add epftoolbox2
From source:
git clone https://github.com/dawidlinek/epftoolbox2.git
cd epftoolbox2
pip install -e .
Verify installation
```
import epftoolbox2

epftoolbox2.verify()
```
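
To confirm which release is active in the current environment, the standard library can report it. This is a minimal sketch using `importlib.metadata`; the helper name `installed_version` is illustrative and not part of epftoolbox2, and it assumes the distribution name matches the `pip install` name above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("epftoolbox2")
    if found is None:
        print("epftoolbox2 is not installed in this environment")
    else:
        print(f"epftoolbox2 {found}")
```
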
Configure for parallel processing (optional but recommended)

See Parallel Model Training below for the free-threaded (GIL-free) Python setup.

Requirements

Python 3.10+ (Python 3.14t recommended for parallel processing)
pandas, numpy, scikit-learn
requests (for API sources)
openpyxl (for Excel export)
rich (for terminal output)

Parallel Model Training (Python 3.13t+)

Models run in parallel using a process pool with inner thread pools - one worker process per group of cores, each using multiple threads internally. This avoids GIL contention and BLAS oversubscription regardless of how many cores you have.

┌─ Process 1 ─────────────┐  ┌─ Process 2 ─────────────┐
│  16 threads             │  │  16 threads             │
│  processes days 0..N    │  │  processes days N..M    │
│  BLAS threads = 1       │  │  BLAS threads = 1       │
└─────────────────────────┘  └─────────────────────────┘
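
The layout in the diagram can be sketched with the standard library alone. This is an illustration of the general pattern (a process pool whose workers each run a thread pool), not epftoolbox2's actual implementation; `split_range`, `fit_one_day`, `run_chunk`, and `run_all` are hypothetical names, and `fit_one_day` is a placeholder for real per-day model fitting.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def split_range(n_items, n_chunks):
    """Split range(n_items) into n_chunks contiguous (start, stop) pairs."""
    base, extra = divmod(n_items, n_chunks)
    bounds, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def fit_one_day(day):
    """Hypothetical stand-in for fitting a model on a single day."""
    return day * day


def run_chunk(bounds, threads):
    """Inside one worker process: fan the chunk's days out over a thread pool."""
    start, stop = bounds
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fit_one_day, range(start, stop)))


def run_all(n_days, processes, threads):
    """One contiguous chunk of days per worker process; results stay in day order."""
    chunks = split_range(n_days, processes)
    with ProcessPoolExecutor(max_workers=processes) as pool:
        per_chunk = pool.map(run_chunk, chunks, [threads] * len(chunks))
        return [value for chunk in per_chunk for value in chunk]
```

Each process receives a contiguous slice of days, matching the `days 0..N` and `days N..M` labels in the diagram. On platforms that spawn workers, call `run_all(...)` from under an `if __name__ == '__main__':` guard.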

For best results, use Python 3.14t (the free-threaded build), which removes the GIL and allows true parallel NumPy execution within each process.

With uv:
uv python install 3.14t
uv venv --python 3.14t

With pyenv:
pyenv install 3.14.0t
pyenv local 3.14.0t
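
After switching interpreters, it can help to confirm that the active Python really is a free-threaded build and that the GIL is off at runtime. A minimal sketch using only the standard library; `describe_interpreter` is an illustrative helper name, not part of epftoolbox2.

```python
import sys
import sysconfig


def describe_interpreter():
    """Summarize the Python version and GIL status of the running interpreter."""
    # Py_GIL_DISABLED is 1 on free-threaded builds; 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists from Python 3.13; older versions always have a GIL.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = gil_check() if gil_check is not None else True
    return {
        "version": ".".join(map(str, sys.version_info[:3])),
        "meets_minimum": sys.version_info >= (3, 10),
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

On a free-threaded build the GIL can still be re-enabled at runtime, for example when an extension module that does not declare free-threading support is imported, so checking `gil_enabled` after importing your dependencies is the more telling test.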

Script Setup

Place this block at the very top of your script:

import os

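# Request a GIL-free interpreter (Python 3.13t+).
# PYTHON_GIL is read at interpreter startup, so setting it here only reaches
# interpreters started afterwards; to cover this process too, export it in the
# shell or run with `python -X gil=0`.
os.environ["PYTHON_GIL"] = "0"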

# Pin BLAS to 1 thread per process — prevents oversubscription
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["OPENBLAS_NUM_THREADS"] = "1"

# Parallelism configuration
# MAX_PROCESSES × THREADS_PER_PROCESS should equal your physical core count
os.environ["MAX_PROCESSES"] = "4"         # worker processes
os.environ["THREADS_PER_PROCESS"] = "16"  # threads per process

# Then import everything else
from epftoolbox2.pipelines import DataPipeline, ModelPipeline
from epftoolbox2.models import OLSModel, LassoCVModel

Environment Variables

| Variable | Default | Description |
|---|---|---|
| `MAX_PROCESSES` | `cpu_count // THREADS_PER_PROCESS` | Number of worker processes |
| `THREADS_PER_PROCESS` | `16` | Threads per worker process (also accepts `MAX_THREADS`) |
| `OMP_NUM_THREADS` | system default | OpenMP threads per process; set to `1` |
| `MKL_NUM_THREADS` | system default | MKL threads per process; set to `1` |
| `OPENBLAS_NUM_THREADS` | system default | OpenBLAS threads per process; set to `1` |
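
To see how the documented defaults combine on a given machine, the resolution can be sketched as follows. This mirrors the table only and is not the library's own code; `resolve_parallelism` is a hypothetical helper, and two details are assumptions: that `THREADS_PER_PROCESS` wins over the `MAX_THREADS` alias when both are set, and that the process count never drops below 1.

```python
import os


def resolve_parallelism(env=None, cpu_count=None):
    """Return (processes, threads) following the documented defaults."""
    env = os.environ if env is None else env
    cpu_count = (os.cpu_count() or 1) if cpu_count is None else cpu_count

    # Assumption: THREADS_PER_PROCESS takes precedence over the MAX_THREADS alias.
    threads = int(env.get("THREADS_PER_PROCESS") or env.get("MAX_THREADS") or 16)

    # Documented default: cpu_count // THREADS_PER_PROCESS.
    # Assumption: keep at least one worker when cpu_count < threads.
    default_processes = max(1, cpu_count // threads)
    processes = int(env.get("MAX_PROCESSES") or default_processes)
    return processes, threads


if __name__ == "__main__":
    processes, threads = resolve_parallelism()
    print(f"{processes} processes x {threads} threads")
```

Note that `os.cpu_count()` reports logical cores, so on machines with simultaneous multithreading the product can exceed the physical core count; setting `MAX_PROCESSES` explicitly avoids that.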

Platform Notes

Linux / macOS

Uses fork: worker processes inherit the parent’s memory directly. No `if __name__ == '__main__':` guard is needed in your scripts.

Windows

Uses spawn: worker processes start fresh Python interpreters. Scripts that call `model.run()` at top level must be wrapped:

if __name__ == '__main__':
    report = model.run(...)
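
Whether the guard matters on a given machine comes down to which start methods Python offers there. A minimal sketch using the standard `multiprocessing` module; `fork_available` is an illustrative helper, and it only reports what the interpreter supports, not which method epftoolbox2 selects.

```python
import multiprocessing as mp


def fork_available():
    """True if this platform can start worker processes with fork."""
    return "fork" in mp.get_all_start_methods()


if __name__ == "__main__":
    if fork_available():
        print("fork is available on this platform")
    else:
        print("fork is not available; keep the __main__ guard")
```

The guard is harmless under fork, so scripts meant to run on both platforms can keep it unconditionally.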