Pipeline Serialization
Save and load pipeline configurations for reproducibility.
Save Pipeline
from epftoolbox2.pipelines import DataPipeline
from epftoolbox2.data.sources import EntsoeSource, CalendarSource
from epftoolbox2.data.transformers import ResampleTransformer

pipeline = (
    DataPipeline()
    .add_source(EntsoeSource(country_code="PL", api_key="...", type=["load"]))
    .add_source(CalendarSource(country="PL"))
    .add_transformer(ResampleTransformer(freq="1h"))
)

# Save to YAML
pipeline.save("pipeline_config.yaml")

Load Pipeline
from epftoolbox2.pipelines import DataPipeline

pipeline = DataPipeline.load("pipeline_config.yaml")
df = pipeline.run(start="2024-01-01", end="2024-06-01")

Configuration Format
sources:
  - type: EntsoeSource
    country_code: PL
    api_key: ${ENTSOE_API_KEY}
    type:
      - load
      - price
  - type: CalendarSource
    country: PL
    holidays: binary

transformers:
  - type: ResampleTransformer
    freq: 1h

validators:
  - type: NullCheckValidator
    columns:
      - load_actual
      - price

Environment Variables
Use ${VAR_NAME} syntax to reference environment variables:
sources:
  - type: EntsoeSource
    api_key: ${ENTSOE_API_KEY}
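
To illustrate what the ${VAR_NAME} placeholder in the fragment above stands for, here is a minimal Python sketch of one way such substitution could be performed on an already-parsed configuration. It is not taken from epftoolbox2: the helper name expand_env_vars, the accepted variable-name pattern, and the choice to raise an error for an unset variable are all assumptions made for the example, and the library's own loader may behave differently.

```python
import os
import re

# Matches ${VAR_NAME}. The allowed name characters are an assumption,
# not something the library documentation specifies.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env_vars(value, env=None):
    """Recursively replace ${VAR_NAME} placeholders in strings, lists, and dicts.

    Hypothetical helper for illustration only; not part of epftoolbox2.
    """
    env = os.environ if env is None else env
    if isinstance(value, str):
        def _lookup(match):
            name = match.group(1)
            if name not in env:
                # Failing loudly avoids sending an empty API key by accident.
                raise KeyError(f"environment variable {name!r} is not set")
            return env[name]
        return _ENV_PATTERN.sub(_lookup, value)
    if isinstance(value, dict):
        return {key: expand_env_vars(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env_vars(item, env) for item in value]
    return value  # numbers, booleans, and None pass through unchanged


# Parsed form of the YAML fragment above, written as a plain dict.
config = {"sources": [{"type": "EntsoeSource", "api_key": "${ENTSOE_API_KEY}"}]}

# A stand-in mapping is passed here so the example does not depend on the
# real environment; omit `env` to read from os.environ instead.
resolved = expand_env_vars(config, env={"ENTSOE_API_KEY": "demo-key"})
print(resolved["sources"][0]["api_key"])  # demo-key
```

Whatever the library does internally, the practical point is the same: the saved YAML file holds only the placeholder, so it can be committed or shared, and the real key has to be present in the environment of the process that loads the pipeline.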