Key Concepts

Time Series Terminology

Term	Description	Config Field
Lookback window	Number of past time steps fed to the model as input	`DataPipelineConfig.lookback_window_size`
Forecast horizon	Number of future time steps to predict	`DataPipelineConfig.forecast_horizon`
Period	Sampling frequency (pandas offset alias, e.g. `"1h"`, `"30min"`)	`DataPipelineConfig.period`
Target feature	The variable(s) to forecast	`DataPipelineConfig.target_feature`
Historical features	Features whose future values are unknown (lookback only)	`DataPipelineConfig.historical_features`
Calendar features	Cyclical temporal features derived from the timestamp (e.g. hour, day of week)	`DataPipelineConfig.calendar_features`
Exogenous features	External features known over the full lookback + forecast horizon	`DataPipelineConfig.exogenous_features`
Future covariates	External features known only over the forecast horizon	`DataPipelineConfig.future_covariates`
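
To make "cyclical" concrete for the calendar features row, the sketch below encodes hour of day as a sine/cosine pair so that 23:00 and 00:00 end up close together. This is one common encoding and an assumption here: the source does not state which encoding Twiga applies, and `encode_cyclical` is a hypothetical helper, not part of the library.

```python
import numpy as np
import pandas as pd


def encode_cyclical(values: pd.Series, period: int) -> pd.DataFrame:
    """Map an integer cycle position (e.g. hour 0-23) onto the unit circle."""
    angle = 2.0 * np.pi * values.to_numpy() / period
    return pd.DataFrame(
        {
            f"{values.name}_sin": np.sin(angle),
            f"{values.name}_cos": np.cos(angle),
        },
        index=values.index,
    )


# Two days of hourly timestamps; the encoding repeats every 24 steps.
timestamps = pd.date_range("2024-01-01", periods=48, freq="1h")
hour = pd.Series(timestamps.hour, index=timestamps, name="hour")
hour_encoded = encode_cyclical(hour, period=24)
```

Every encoded row lies on the unit circle, so the distance between consecutive hours is constant, including across midnight.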

Feature availability across the time axis

The target and the four feature types differ in which portion of the time axis they cover: the lookback window, the forecast horizon, or both.

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#e0f8f4', 'primaryTextColor': '#263238', 'primaryBorderColor': '#0f718e', 'lineColor': '#0f718e', 'clusterBkg': '#f0fdf9', 'clusterBorder': '#00bfa5', 'titleColor': '#263238'}}}%%
graph LR
    subgraph PAST["Lookback Window  ( t-L … t )"]
        TP["target_feature\nas lagged input"]
        HF["historical_features\npast only"]
        CP["calendar_features\nderived from timestamp"]
        EP["exogenous_features\nfull window known"]
    end
    subgraph FUTURE["Forecast Horizon  ( t+1 … t+H )"]
        TF["target_feature\npredicted output"]
        CF["calendar_features\nderived from timestamp"]
        EF["exogenous_features\nfull horizon known"]
        FC["future_covariates\nhorizon only"]
    end
    PAST -->|"t → t+1"| FUTURE

    classDef target fill:#0f718e,stroke:#0f718e,color:#fff,rx:6
    classDef hist   fill:#263238,stroke:#263238,color:#fff,rx:6
    classDef cal    fill:#00bfa5,stroke:#00897b,color:#fff,rx:6
    classDef exog   fill:#e0f8f4,stroke:#0f718e,color:#263238,rx:6
    classDef fcov   fill:#f0fdf9,stroke:#00bfa5,color:#263238,rx:6

    class TP,TF target
    class HF hist
    class CP,CF cal
    class EP,EF exog
    class FC fcov

Feature type	Lookback	Forecast horizon	Typical examples
`target_feature`	Used as lagged input	Predicted output	Electricity load, solar generation
`historical_features`	Available	Not available	Sensor readings without NWP forecast
`calendar_features`	Derived from timestamp	Derived from timestamp	Hour of day, day of week, month
`exogenous_features`	Available	Available (full horizon)	NWP weather forecast, scheduled output
`future_covariates`	Not used	Available (horizon only)	Day-ahead price signal, planned events

Why the distinction matters

historical_features can only contribute lag/rolling statistics — their future values are unknown. exogenous_features and future_covariates are passed directly into the forecast window so the model conditions on their future values. calendar_features are always derivable from the timestamp and computed automatically.
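
The availability rules can be illustrated with plain NumPy slicing. This is a toy sketch, not Twiga's windowing code: the array names, the forecast origin `t`, and the choice of a lookback slice of exactly `L` steps ending at `t` are all assumptions made for illustration.

```python
import numpy as np

L, H = 4, 2      # toy lookback_window_size and forecast_horizon
n_steps = 10
t = 6            # forecast origin: index of the last observed step

rng = np.random.default_rng(0)
target = rng.normal(size=n_steps)
historical = rng.normal(size=n_steps)
exogenous = rng.normal(size=n_steps)
future_cov = rng.normal(size=n_steps)

past = slice(t - L + 1, t + 1)     # L steps ending at t
future = slice(t + 1, t + 1 + H)   # H steps after t

model_inputs = {
    "target_lookback": target[past],               # lagged target values
    "historical_lookback": historical[past],       # past only
    "exogenous_lookback": exogenous[past],         # known in the past...
    "exogenous_horizon": exogenous[future],        # ...and over the horizon
    "future_covariates_horizon": future_cov[future],  # horizon only
}
model_output = target[future]  # values the model is trained to predict
```

Note that `historical` never appears with the `future` slice: using it there would leak information that is unavailable at prediction time.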

Data Format Requirements

Twiga expects a pandas DataFrame with:

A datetime column named "timestamp" by default (configurable via date_column)
One or more target columns — the variable(s) to forecast
Optional feature columns — any combination of the four feature types

import pandas as pd

df = pd.DataFrame({
    "timestamp":   pd.date_range("2024-01-01", periods=1000, freq="1h"),
    "load_mw":     [...],   # target_feature
    "temperature": [...],   # exogenous_features  — NWP forecast known for full horizon
    "wind_speed":  [...],   # future_covariates   — known only over forecast horizon
    "irradiance":  [...],   # historical_features — no future forecast available
    # calendar features (hour, dayofweek, etc.) are derived automatically from timestamp
})
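
The `[...]` placeholders above stand for real measurements. For experimentation, a frame with the same columns can be filled with synthetic values, as in the sketch below. The numbers are random and purely illustrative; only the column names come from the example above.

```python
import numpy as np
import pandas as pd

n = 1000
rng = np.random.default_rng(42)
timestamps = pd.date_range("2024-01-01", periods=n, freq="1h")
hour = timestamps.hour.to_numpy()

df = pd.DataFrame({
    "timestamp": timestamps,
    # Daily sinusoid plus noise as a stand-in for a load curve
    "load_mw": 50 + 10 * np.sin(2 * np.pi * hour / 24) + rng.normal(0, 1, n),
    "temperature": 15 + 5 * np.cos(2 * np.pi * hour / 24) + rng.normal(0, 0.5, n),
    "wind_speed": rng.gamma(shape=2.0, scale=3.0, size=n),
    # Zero at night, positive between 06:00 and 18:00
    "irradiance": np.clip(800 * np.sin(np.pi * (hour - 6) / 12), 0, None),
})
```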

The config tells the pipeline which columns play which role:

from twiga.core.config import DataPipelineConfig

data_config = DataPipelineConfig(
    target_feature="load_mw",
    period="1h",
    lookback_window_size=168,
    forecast_horizon=48,
    historical_features=["irradiance"],
    calendar_features=["hour", "dayofweek"],
    exogenous_features=["temperature"],
    future_covariates=["wind_speed"],
)

Note

The DataFrame must be sorted by timestamp with a regular frequency. Handle missing values before passing data to the pipeline.
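
The requirements in the note can be checked before the data reaches the pipeline. The helper below is a hypothetical pre-flight check written with plain pandas; it is not a Twiga function, and it assumes a fixed-length period such as `"1h"` or `"30min"` that `pd.Timedelta` can parse.

```python
import pandas as pd


def check_time_index(
    df: pd.DataFrame, date_column: str = "timestamp", period: str = "1h"
) -> list[str]:
    """Return a list of problems found; an empty list means the frame looks usable."""
    problems = []
    ts = df[date_column]
    if not ts.is_monotonic_increasing:
        problems.append("timestamps are not sorted")
    if ts.duplicated().any():
        problems.append("duplicate timestamps")
    # Every consecutive gap must equal the configured period
    steps = ts.sort_values().diff().dropna()
    if not (steps == pd.Timedelta(period)).all():
        problems.append(f"irregular frequency (expected {period} steps)")
    if df.isna().any().any():
        problems.append("missing values present")
    return problems


good = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=5, freq="1h"),
    "load_mw": [1.0, 2.0, 3.0, 4.0, 5.0],
})
bad = good.drop(index=2)  # removing one row leaves a two-hour gap

good_report = check_time_index(good)
bad_report = check_time_index(bad)
```

Here `good_report` is empty, while `bad_report` flags the irregular frequency introduced by the dropped row.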

Configuration-Driven Design

Twiga follows a configuration-as-code pattern. Every component is configured via a Pydantic dataclass that validates inputs at construction time. The three core configs are:

DataPipelineConfig

Controls data preprocessing — what features to engineer, how to scale, and how to create sequences.

from twiga.core.config import DataPipelineConfig

data_config = DataPipelineConfig(
    target_feature="load_mw",
    period="1h",
    lookback_window_size=168,           # 7 days of hourly data
    forecast_horizon=48,                # predict 2 days ahead
    historical_features=["irradiance"], # past-only, no future forecast
    calendar_features=["hour", "dayofweek"],
    exogenous_features=["temperature"], # known over full horizon
    future_covariates=["wind_speed"],   # known only for forecast window
    lags=[1, 24, 48, 168],
    windows=[24, 48],
    window_funcs=["mean", "std"],
)
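
To show what `lags`, `windows`, and `window_funcs` typically translate to, the sketch below builds lag and rolling-statistic columns with plain pandas. It is an illustration of the general technique only: the column naming and the `shift(1)` before rolling are assumptions, and Twiga's actual feature engineering may differ.

```python
import pandas as pd

series = pd.Series(
    range(200),
    index=pd.date_range("2024-01-01", periods=200, freq="1h"),
    name="irradiance",
    dtype="float64",
)

lags = [1, 24]
windows = [24]
window_funcs = ["mean", "std"]

features = pd.DataFrame(index=series.index)
for lag in lags:
    # Value observed `lag` steps earlier
    features[f"{series.name}_lag_{lag}"] = series.shift(lag)
for window in windows:
    # shift(1) keeps the window strictly in the past, so the current value never leaks in
    rolled = series.shift(1).rolling(window)
    for func in window_funcs:
        features[f"{series.name}_roll_{window}_{func}"] = rolled.agg(func)
```

The first rows contain NaN until enough history is available, which is one reason a lookback window needs to be at least as long as the largest lag.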

See Configuration System for the full field reference.

ForecasterConfig

Controls training orchestration — backtesting splits, project naming, and output directories.

from twiga.core.config import ForecasterConfig

train_config = ForecasterConfig(
    split_freq="months",
    train_size=6,
    test_size=1,
    gap=0,
    window="expanding",
    project_name="MyProject",
    seed=42,
)
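
One plausible reading of these settings is an expanding-window backtest: the first fold trains on 6 months and tests on the following month, and each later fold extends the training period by one month. The generator below sketches that reading with pandas date offsets. The function name and the exact fold semantics are assumptions, not Twiga's implementation.

```python
import pandas as pd


def expanding_month_splits(start, end, train_size=6, test_size=1, gap=0):
    """Yield (train_start, train_end, test_start, test_end) as half-open [start, end) ranges."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    train_end = start + pd.DateOffset(months=train_size)
    while True:
        test_start = train_end + pd.DateOffset(months=gap)
        test_end = test_start + pd.DateOffset(months=test_size)
        if test_end > end:
            break
        yield start, train_end, test_start, test_end
        # Expanding window: the training start stays fixed, the end moves forward
        train_end = train_end + pd.DateOffset(months=test_size)


splits = list(expanding_month_splits("2023-01-01", "2024-01-01"))
for train_start, train_end, test_start, test_end in splits:
    print(f"train [{train_start.date()}, {train_end.date()})  "
          f"test [{test_start.date()}, {test_end.date()})")
```

With one year of data this yields six folds, the last one testing on December 2023.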

Model Configs

Each model has its own config class inheriting from BaseModelConfig (ML/baseline) or NeuralModelConfig (NN):

from twiga.models.ml.xgboost_model import XGBOOSTConfig

xgb_config = XGBOOSTConfig(
    device="cpu",
    random_state=42,
)

Model Domains

Twiga organizes models into three domains:

Domain	Base Class	Training Framework	Models
`"baseline"`	`BaseRegressor`	scikit-learn API	Naive, SeasonalNaive, WindowAverage, Drift, ContextParrot
`"ml"`	`BaseRegressor`	scikit-learn API	CatBoost, XGBoost, LightGBM, RandomForest, LinearReg, NGBoost variants, QR variants
`"nn"`	`BaseNeuralForecast`	PyTorch Lightning	MLPF, MLPGAM, MLPGAF, N-HiTS, GANF and their probabilistic variants

The domain is set automatically from the model config’s domain field and controls how TwigaForecaster handles training, checkpointing, and prediction.

Baseline models require no training — fit() only stores metadata — making them fast reference points for computing skill scores. See Baseline Models and Model Catalog for the full list.
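
A skill score compares a model's error with that of a baseline. The sketch below uses the common MAE-based form, 1 - MAE_model / MAE_baseline; the source does not say which metric Twiga uses, so the choice of MAE and the function names are assumptions.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


def skill_score(y_true, y_model, y_baseline):
    """1 - MAE_model / MAE_baseline: > 0 beats the baseline, 0 ties, < 0 is worse."""
    return 1.0 - mae(y_true, y_model) / mae(y_true, y_baseline)


y_true = np.array([10.0, 12.0, 14.0, 16.0])
y_baseline = np.array([8.0, 10.0, 12.0, 14.0])  # e.g. a naive forecast, off by 2 everywhere
y_model = np.array([9.0, 11.0, 13.0, 15.0])     # off by 1 everywhere

score = skill_score(y_true, y_model, y_baseline)  # 1 - 1/2 = 0.5
```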

Forecasting Types

Twiga supports three types of forecasting:

Point Forecasting

Produces a single predicted value for each future time step. All models support point forecasting by default.

predictions = forecaster.predict(test_df)

Probabilistic Forecasting

Produces a distribution over predicted values via quantile regression, parametric distributions, or a distribution-free conformal step.

ML probabilistic models:

QRCATBOOSTModel, QRXGBOOSTModel, QRLIGHTGBMModel — quantile regression
GAUSSCATBOOSTModel — Gaussian (mean + sigma) output
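
Quantile regression models are usually trained and evaluated with the pinball (quantile) loss, which penalizes under- and over-prediction asymmetrically. The function below is a generic NumPy sketch of that loss for a single quantile level; it is not taken from Twiga, and the name `pinball_loss` is illustrative.

```python
import numpy as np


def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for quantile level q in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction (diff > 0) costs q per unit, over-prediction costs (1 - q)
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))


# For the 0.9 quantile, predicting too low is nine times as costly as too high.
loss_under = pinball_loss([10.0], [8.0], q=0.9)   # 0.9 * 2 = 1.8
loss_over = pinball_loss([10.0], [12.0], q=0.9)   # 0.1 * 2 = 0.2
```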

NN probabilistic models use a composable backbone/head design. Every architecture (MLPF, MLPGAM, MLPGAF, NHITS) can be paired with any distribution head by selecting the appropriate config:

Distribution	Use case	Example config
Normal	Symmetric, unbounded targets	`MLPFNormalConfig`, `NHITSNormalConfig`
Laplace	Heavy-tailed, outlier-robust	`MLPFLaplaceConfig`
LogNormal	Strictly positive, right-skewed	`MLPGAMLogNormalConfig`
Gamma	Strictly positive, flexible skew	`MLPGAFGammaConfig`
Beta	Bounded [0, 1] targets	`NHITSBetaConfig`
StudentT	Very heavy tails	`MLPGAMStudentTConfig`
QR	Fixed-grid quantile regression	`MLPFQRConfig`, `NHITSQRConfig`
FPQR	Adaptive quantile proposals	`MLPGAMFPQRConfig`
CRC	Conformal residual coverage	`MLPGAMCRCConfig`

See Distribution Families for the backbone/head architecture and Quantile Regression for the QR-specific approach.

Interval Forecasting

Produces prediction intervals (lower and upper bounds) via conformal prediction. This is a post-hoc calibration step applied to any trained model.

# Calibrate conformal prediction on held-out data
forecaster.calibrate(calibration_df)

# Generate prediction intervals
intervals = forecaster.predict_interval(test_df)
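
For intuition about what a calibration step can compute, the sketch below implements basic split conformal prediction with absolute residuals: the interval half-width is the appropriately ranked residual on held-out data. This is the textbook method under stated assumptions, not necessarily what `calibrate` and `predict_interval` do internally, and `conformal_halfwidth` is a hypothetical name.

```python
import numpy as np


def conformal_halfwidth(y_cal, pred_cal, alpha=0.1):
    """Half-width giving roughly (1 - alpha) coverage under exchangeability."""
    scores = np.sort(np.abs(np.asarray(y_cal, dtype=float) - np.asarray(pred_cal, dtype=float)))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the conformal quantile
    # With too few calibration points the guarantee requires an infinite interval
    return float(scores[k - 1]) if k <= n else float("inf")


# 19 calibration points with absolute residuals 1, 2, ..., 19
y_cal = np.arange(1.0, 20.0)
pred_cal = np.zeros(19)
halfwidth = conformal_halfwidth(y_cal, pred_cal, alpha=0.1)  # rank 18 of 19

point_forecast = 100.0
lower, upper = point_forecast - halfwidth, point_forecast + halfwidth
```

Because the half-width depends only on held-out residuals, the same step can wrap any point forecaster, which is what makes the approach post hoc.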

Next: Quick Start Guide