Incremental Training

When a run covers a long test period, epftoolbox2 can save intermediate results and resume from them if the run is interrupted.

Enabling Incremental Training

report = pipeline.run(
    data=df,
    test_start="2024-01-01",
    test_end="2024-12-31",
    save_dir="results",  # Enable incremental saving
)

How It Works

- Each task (date × hour × horizon × model) is saved after completion.
- If the script is interrupted, re-running it loads the completed tasks.
- Only the missing tasks are computed.
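
The resume behaviour described above can be illustrated with a small standalone sketch. This is not epftoolbox2's internal code: the helper names `load_completed` and `run_incremental`, the `compute` callback, and the choice of `(target_date, hour, horizon)` as the task key are all assumptions made for illustration. It only shows the general pattern of appending one JSON line per finished task and skipping tasks that are already on disk.

```python
import json
from pathlib import Path


def load_completed(path):
    """Return the set of task keys already stored in a JSONL file."""
    done = set()
    if not path.exists():
        return done
    with path.open() as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            rec = json.loads(line)
            # Hypothetical task key; the library may identify tasks differently.
            done.add((rec["target_date"], rec["hour"], rec["horizon"]))
    return done


def run_incremental(tasks, compute, path):
    """Compute only the tasks missing from `path`; return how many were computed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    done = load_completed(path)
    computed = 0
    with path.open("a") as fh:
        for target_date, hour, horizon in tasks:
            if (target_date, hour, horizon) in done:
                continue  # already saved by an earlier run
            record = {
                "target_date": target_date,
                "hour": hour,
                "horizon": horizon,
                "prediction": compute(target_date, hour, horizon),
            }
            fh.write(json.dumps(record) + "\n")
            fh.flush()  # push each finished task out before starting the next
            computed += 1
    return computed
```

Calling `run_incremental` twice with the same task list and the same path computes everything on the first call and nothing on the second, which is the behaviour the list above describes.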

Cache Directory Structure

results/
- ols.jsonl
- lasso.jsonl
- model_name.jsonl
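
Because each model has its own file, a quick way to check progress is to count the lines per file. The sketch below uses only the standard library; `count_cached_tasks` is a hypothetical helper, not part of epftoolbox2, and it assumes that the file stem is the model name and that every non-empty line is one completed task.

```python
from pathlib import Path


def count_cached_tasks(save_dir):
    """Map each model name (file stem) to its number of non-empty lines."""
    counts = {}
    for path in sorted(Path(save_dir).glob("*.jsonl")):
        with path.open() as fh:
            counts[path.stem] = sum(1 for line in fh if line.strip())
    return counts
```

For the layout above, `count_cached_tasks("results")` would return a dictionary with keys such as `"ols"` and `"lasso"`.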

Results File Format

Each line in a results file (for example, ols.jsonl) is a JSON object:

{"run_date": "2024-02-01", "target_date": "2024-02-02", "hour": 1, "horizon": 1, "day_in_test": 0, "prediction": 92.08884109589042, "actual": 58.0, "coefficients": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]}
{"run_date": "2024-02-01", "target_date": "2024-02-02", "hour": 0, "horizon": 1, "day_in_test": 0, "prediction": 92.33908219178082, "actual": 58.62, "coefficients": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]}
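
Since every line is an independent JSON object, the files can be read with the standard library alone. The following sketch parses the two sample records shown above from an in-memory string and computes their mean absolute error; `read_results` and `mean_absolute_error` are illustrative helper names, not library functions. To read a real file, pass an open file handle instead of the `io.StringIO` object.

```python
import io
import json

# The two sample records from above, copied verbatim.
SAMPLE = """\
{"run_date": "2024-02-01", "target_date": "2024-02-02", "hour": 1, "horizon": 1, "day_in_test": 0, "prediction": 92.08884109589042, "actual": 58.0, "coefficients": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]}
{"run_date": "2024-02-01", "target_date": "2024-02-02", "hour": 0, "horizon": 1, "day_in_test": 0, "prediction": 92.33908219178082, "actual": 58.62, "coefficients": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]}
"""


def read_results(fh):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in fh if line.strip()]


def mean_absolute_error(records):
    """Average of |prediction - actual| over all records."""
    errors = [abs(r["prediction"] - r["actual"]) for r in records]
    return sum(errors) / len(errors)


records = read_results(io.StringIO(SAMPLE))
mae = mean_absolute_error(records)
print(f"{len(records)} records, MAE = {mae:.2f}")
```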