Model Pipeline
The ModelPipeline class trains and evaluates forecasting models with rolling-window validation.
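To make "rolling-window validation" concrete, the sketch below generates one training window per test day and slides it forward a day at a time. It is a generic, dependency-free illustration: the helper name rolling_windows and the train_days parameter are hypothetical, test_end is treated as exclusive by assumption, and whether ModelPipeline uses a fixed-length or an expanding window is not stated here.

```python
from datetime import date, timedelta


def rolling_windows(test_start, test_end, train_days):
    """Yield (train_start, train_end, forecast_day) for each test day.

    The training window covers the `train_days` days immediately before
    `forecast_day` and slides forward one day per step.
    `test_end` is treated as exclusive (an assumption of this sketch).
    """
    day = test_start
    while day < test_end:
        train_end = day - timedelta(days=1)
        train_start = day - timedelta(days=train_days)
        yield train_start, train_end, day
        day += timedelta(days=1)


windows = list(rolling_windows(date(2024, 2, 1), date(2024, 3, 1), train_days=365))

print(len(windows))   # 29 test days in February 2024
print(windows[0])     # first window ends on 2024-01-31
print(windows[-1])    # last forecast day is 2024-02-29
```

Each model would be refit on its training window and asked to forecast the corresponding day, so no forecast ever sees data from its own future.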
Basic Usage
```python
from epftoolbox2.pipelines import ModelPipeline
from epftoolbox2.models import OLSModel, LassoCVModel
from epftoolbox2.evaluators import MAEEvaluator
from epftoolbox2.exporters import ExcelExporter, TerminalExporter

# `predictors` is a list of feature specifications (see Predictor Specification);
# `df` is a DataFrame with a DatetimeIndex.
pipeline = (
    ModelPipeline()
    .add_model(OLSModel(predictors=predictors, name="OLS"))
    .add_model(LassoCVModel(predictors=predictors, name="LassoCV"))
    .add_evaluator(MAEEvaluator())
    .add_exporter(TerminalExporter())
    .add_exporter(ExcelExporter("results.xlsx"))
)

report = pipeline.run(
    data=df,
    test_start="2024-02-01",
    test_end="2024-03-01",
    target="price",
    horizon=7,
    save_dir="results",
)
```

Pipeline Components
Models
- OLSModel: Ordinary Least Squares
- LassoCVModel: Lasso with cross-validation
Evaluators
- MAEEvaluator: Mean Absolute Error
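Mean Absolute Error is the average of |actual − forecast| over all forecast points. The function below is a plain-Python illustration of the metric, not MAEEvaluator's implementation; the name mean_absolute_error and the sample values are made up for the example.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 57.0, 70.0, 85.0]

mae = mean_absolute_error(actual, predicted)  # (2 + 3 + 0 + 5) / 4 = 2.5
print(mae)
```

Because errors are not squared, MAE is expressed in the same units as the target and is less dominated by a few large misses than squared-error metrics.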
Exporters
- TerminalExporter: Console output
- ExcelExporter: Excel file
Predictor Specification
Predictors can be specified in four ways.

As a list of column-name strings:

```python
predictors = [
    "load_actual",
    "is_monday_d+{horizon}",
    "is_tuesday_d+{horizon}",
    "is_wednesday_d+{horizon}",
    "is_thursday_d+{horizon}",
    "is_friday_d+{horizon}",
    "is_saturday_d+{horizon}",
    "is_sunday_d+{horizon}",
    "is_holiday_d+{horizon}",
    "daylight_hours_d+{horizon}",
]
```

Use the {horizon} placeholder for horizon-dependent features:

```python
predictors = [
    "load_actual",
    "warsaw_temperature_2m_d+{horizon}",
]
```

For complex patterns, pass a callable that takes the horizon and returns the column name:

```python
predictors = [
    "load_actual",
    lambda h: f"weather_d+{h}",
]
```

Use list comprehensions for many features:

```python
predictors = [
    "load_actual",
    *[f"load_actual_h-{i}" for i in range(1, 169)],  # 168 hourly lags
    *[f"price_d-{i}" for i in range(1, 8)],          # 7 daily lags
]
```

Feature Scaling
Models automatically apply StandardScaler:
- Numeric features are standardized (mean=0, std=1)
- Binary features (0/1) are auto-detected and skipped
- Target variable is scaled, predictions are inverse-scaled
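The three rules above can be sketched without any dependencies. The helpers below are hypothetical and only mimic the described behaviour: standardize with the population mean and standard deviation (which is what scikit-learn's StandardScaler computes), leave 0/1 columns untouched, and map scaled values of the target back to the original units. How the library detects binary columns internally is an assumption here.

```python
from statistics import fmean, pstdev


def is_binary(column):
    """True if the column contains only 0/1 values."""
    return set(column) <= {0, 1}


def fit_scaler(column):
    """Return (mean, std); std falls back to 1.0 for constant columns."""
    mean = fmean(column)
    std = pstdev(column)  # population std (ddof=0), as StandardScaler uses
    return mean, (std if std > 0 else 1.0)


def scale(column, mean, std):
    return [(x - mean) / std for x in column]


def inverse_scale(values, mean, std):
    return [v * std + mean for v in values]


features = {
    "load_actual": [10.0, 20.0, 30.0],  # numeric: standardized
    "is_holiday": [0, 1, 0],            # binary: passed through unchanged
}

scaled = {}
for name, col in features.items():
    if is_binary(col):
        scaled[name] = list(col)
    else:
        scaled[name] = scale(col, *fit_scaler(col))

# The target is scaled for fitting; predictions are mapped back afterwards.
price = [40.0, 50.0, 60.0]
p_mean, p_std = fit_scaler(price)
scaled_price = scale(price, p_mean, p_std)
restored = inverse_scale(scaled_price, p_mean, p_std)
```

Skipping binary columns keeps dummy variables such as day-of-week flags interpretable as 0/1 switches, while inverse-scaling ensures reported errors are in the target's original units.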
Run Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| data | DataFrame | Required | Input data with DatetimeIndex |
| test_start | str | Required | Test period start |
| test_end | str | Required | Test period end |
| target | str | "price" | Target column name |
| horizon | int | 7 | Max forecast horizon (days) |
| save_dir | str | None | Directory for incremental results |
EvaluationReport
The pipeline returns an EvaluationReport:
```python
report.summary()          # Overall metrics
report.by_hour()          # Breakdown by hour
report.by_horizon()       # Breakdown by horizon
report.by_hour_horizon()  # Combined breakdown
```
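To illustrate what a per-hour or per-horizon breakdown means, the sketch below groups absolute errors by a chosen field and averages each group. The record layout (hour, horizon, actual, predicted) and the helper mae_by are assumptions made for this example; they are not EvaluationReport's internals or its output format.

```python
from collections import defaultdict


def mae_by(records, key):
    """Mean absolute error of `records`, grouped by the field named `key`."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(abs(rec["actual"] - rec["predicted"]))
    return {k: sum(errs) / len(errs) for k, errs in sorted(groups.items())}


# Hypothetical forecast records: one row per (hour, horizon) pair.
records = [
    {"hour": 0, "horizon": 1, "actual": 50.0, "predicted": 48.0},
    {"hour": 0, "horizon": 2, "actual": 55.0, "predicted": 51.0},
    {"hour": 1, "horizon": 1, "actual": 60.0, "predicted": 61.0},
    {"hour": 1, "horizon": 2, "actual": 65.0, "predicted": 62.0},
]

by_hour = mae_by(records, "hour")        # {0: 3.0, 1: 2.0}
by_horizon = mae_by(records, "horizon")  # {1: 1.5, 2: 3.5}
print(by_hour)
print(by_horizon)
```

A breakdown like this shows whether errors concentrate at particular hours of the day or grow with the forecast horizon, which an overall average would hide.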