Installation

  1. Install the package

    pip install epftoolbox2
  2. Verify installation

    import epftoolbox2
    epftoolbox2.verify()
  3. Configure for parallel processing (optional but recommended)

    See the section below for GIL-free Python setup.


Requirements

  • Python 3.10+ (Python 3.14t recommended for parallel processing)
  • pandas, numpy, scikit-learn
  • requests (for API sources)
  • openpyxl (for Excel export)
  • rich (for terminal output)
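
The requirements above can be checked before running a pipeline. The following is a minimal sketch using only the standard library; the helper names `missing_modules` and `python_is_supported` are illustrative and not part of epftoolbox2. It assumes the usual import names for each package (scikit-learn is imported as `sklearn`).

```python
import importlib.util
import sys

# Import names for the packages listed above (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["pandas", "numpy", "sklearn", "requests", "openpyxl", "rich"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(version_info=sys.version_info):
    """True when the interpreter meets the documented Python 3.10+ minimum."""
    return tuple(version_info[:2]) >= (3, 10)


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Missing packages:", missing_modules(REQUIRED_MODULES) or "none")
```

This check only looks for the modules on the import path; it does not import them or verify their versions.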

Parallel Model Training (Python 3.13t+)

Models are trained in parallel using a process pool with inner thread pools: one worker process per group of cores, each running multiple threads internally. This layout avoids GIL contention and BLAS oversubscription regardless of how many cores you have.

┌─ Process 1 ─────────────┐ ┌─ Process 2 ─────────────┐
│ 16 threads              │ │ 16 threads              │
│ processes days 0..N     │ │ processes days N..M     │
│ BLAS threads = 1        │ │ BLAS threads = 1        │
└─────────────────────────┘ └─────────────────────────┘

For best results use Python 3.14t (free-threading build), which removes the GIL and allows true parallel numpy execution within each process.
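
The process-plus-thread layout described above can be sketched with the standard library alone. This is an illustration of the general pattern, not epftoolbox2's actual implementation: `chunk_days`, `process_chunk`, `run_parallel`, and the stand-in workload `fit_one_day` are all hypothetical names, and the sketch requests the fork start method explicitly, which is not available on Windows.

```python
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def chunk_days(days, n_chunks):
    """Split `days` into at most `n_chunks` contiguous, near-equal slices."""
    size, extra = divmod(len(days), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty slices when there are more chunks than days
            chunks.append(days[start:end])
        start = end
    return chunks


def fit_one_day(day):
    """Hypothetical stand-in for training and predicting a single day."""
    return day * day


def process_chunk(days, n_threads=4):
    """Runs inside one worker process: fan the chunk out over an inner thread pool."""
    with ThreadPoolExecutor(max_workers=n_threads) as threads:
        return list(threads.map(fit_one_day, days))


def run_parallel(days, n_processes=2, n_threads=4):
    """Outer process pool; each worker handles one contiguous chunk of days."""
    chunks = chunk_days(days, n_processes)
    ctx = mp.get_context("fork")  # assumption: fork is available (not on Windows)
    with ProcessPoolExecutor(max_workers=n_processes, mp_context=ctx) as procs:
        results = procs.map(process_chunk, chunks, [n_threads] * len(chunks))
        # map() preserves chunk order, so flattening restores the original day order.
        return [value for chunk in results for value in chunk]
```

Each process receives a contiguous range of days, mirroring the diagram, and the inner threads only run truly in parallel on CPU-bound work when the GIL is disabled.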

Installing Python 3.14t

uv python install 3.14t
uv venv --python 3.14t
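
After creating the environment, it can be useful to confirm that the interpreter really is a free-threaded build. The sketch below relies on two CPython facilities available from Python 3.13: the `Py_GIL_DISABLED` build variable and `sys._is_gil_enabled()`. The helper name `free_threading_status` is illustrative.

```python
import sys
import sysconfig


def free_threading_status():
    """Report whether this is a free-threaded build and whether the GIL is active."""
    # Py_GIL_DISABLED is 1 on free-threaded builds (3.13t+), 0 or None otherwise.
    is_free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists from Python 3.13; assume the GIL is on when absent.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = gil_check() if gil_check is not None else True
    return {"free_threaded_build": is_free_threaded_build, "gil_enabled": gil_enabled}


if __name__ == "__main__":
    print(sys.version)
    print(free_threading_status())
```

On a free-threaded build the GIL can still be re-enabled at runtime, for example when an extension module that does not declare free-threading support is imported, so checking `gil_enabled` after importing your dependencies gives the most reliable answer.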

Script Setup

Place this block at the very top of your script:

import os
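# Request free-threading (Python 3.13t+). CPython reads PYTHON_GIL at interpreter
# startup, so also export it in the shell (or run `python -X gil=0`) to affect this process.
os.environ["PYTHON_GIL"] = "0"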
# Pin BLAS to 1 thread per process — prevents oversubscription
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["OPENBLAS_NUM_THREADS"] = "1"
# Parallelism configuration
# MAX_PROCESSES × THREADS_PER_PROCESS should equal your physical core count
os.environ["MAX_PROCESSES"] = "4"  # worker processes
os.environ["THREADS_PER_PROCESS"] = "16"  # threads per process
# Then import everything else
from epftoolbox2.pipelines import DataPipeline, ModelPipeline
from epftoolbox2.models import OLSModel, LassoCVModel

Environment Variables

Variable               Default                             Description
MAX_PROCESSES          cpu_count // THREADS_PER_PROCESS    Number of worker processes
THREADS_PER_PROCESS    16                                  Threads per worker process (also accepts MAX_THREADS)
OMP_NUM_THREADS        system default                      OpenMP threads per process; set to 1
MKL_NUM_THREADS        system default                      MKL threads per process; set to 1
OPENBLAS_NUM_THREADS   system default                      OpenBLAS threads per process; set to 1
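
To make the defaults concrete, here is a sketch of how the two parallelism settings could be resolved from the environment. It is not the library's code: the function name `resolve_parallelism` is hypothetical, and two behaviours are assumptions not stated in the table, namely that `THREADS_PER_PROCESS` takes precedence over the `MAX_THREADS` alias and that the process count is clamped to at least 1.

```python
import os


def resolve_parallelism(env=None, cpu_count=None):
    """Return (max_processes, threads_per_process) using the documented defaults."""
    env = os.environ if env is None else env
    cpu_count = (os.cpu_count() or 1) if cpu_count is None else cpu_count
    # Assumption: THREADS_PER_PROCESS takes precedence over the MAX_THREADS alias.
    threads = int(env.get("THREADS_PER_PROCESS") or env.get("MAX_THREADS") or 16)
    # Documented default: cpu_count // THREADS_PER_PROCESS.
    # Assumption: clamp to one process when there are fewer cores than threads.
    default_processes = max(1, cpu_count // threads)
    processes = int(env.get("MAX_PROCESSES") or default_processes)
    return processes, threads


if __name__ == "__main__":
    print(resolve_parallelism())
```

With no variables set on a 64-core machine, this yields 4 processes of 16 threads each, matching the values used in the script setup block. Note that `os.cpu_count()` reports logical cores, which may be double the physical core count on machines with SMT enabled.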

Platform Notes

Worker processes are started with fork, so they inherit the parent’s memory directly. No if __name__ == '__main__': guard is needed in your scripts.
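
Whether fork is available depends on the platform: it does not exist on Windows, and it is no longer the default start method on macOS or, from Python 3.14, on Linux (where the default is forkserver). The sketch below shows how code can check for fork and request it explicitly through the standard `multiprocessing` module. The helper names are illustrative, and whether epftoolbox2 selects the start method this way internally is an assumption.

```python
import multiprocessing as mp


def fork_available():
    """True when the 'fork' start method exists on this platform."""
    return "fork" in mp.get_all_start_methods()


def fork_context():
    """Return an explicit fork context, or None where fork is unsupported."""
    # get_context("fork") does not change the process-wide default start method.
    return mp.get_context("fork") if fork_available() else None


if __name__ == "__main__":
    print("fork available:", fork_available())
```

When worker processes are started with spawn or forkserver instead, the main module is re-imported in each worker, and the `if __name__ == '__main__':` guard becomes necessary.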