Neural Network Models#

Source Files
  • twiga/models/nn/core/base.py - BaseNeuralForecast (training orchestration)

  • twiga/models/nn/core/base_model.py - BaseNeuralModel (LightningModule wrapper)

  • twiga/models/nn/core/base_arch.py - BaseArchitecture (nn.Module ABC)

  • twiga/models/nn/core/embedding.py - Value & positional embeddings

  • twiga/models/nn/core/linear.py - create_linear, MLPBlock, PastFutureEncoder, FeedForward

  • twiga/models/nn/mlpf_model.py - MLPFModel / MLPFConfig

  • twiga/models/nn/mlpgam_model.py - MLPGAMModel / MLPGAMConfig

  • twiga/models/nn/mlpgaf_model.py - MLPGAFModel / MLPGAFConfig

  • twiga/models/nn/nhits_model.py - NHITSModel / NHITSConfig

  • twiga/models/nn/rnn_model.py - RNNModel / RNNConfig

  • twiga/models/nn/mlpfqr_model.py - MLPFQRModel / MLPFQRConfig

  • twiga/models/nn/mlpfcrc_model.py - MLPFCRCModel / MLPFCRCConfig

  • twiga/models/nn/mlpgamcrc_model.py - MLPGAMCRCModel / MLPGAMCRCConfig

  • twiga/models/nn/mlpgafcrc_model.py - MLPGAFCRCModel / MLPGAFCRCConfig

  • twiga/models/nn/nhitscrc_model.py - NHITSCRCModel / NHITSCRCConfig

  • twiga/models/nn/net/ - Network architecture implementations

  • twiga/models/nn/prob/core.py - ProbabilisticModel / ProbabilisticNetwork

  • twiga/models/nn/prob/rnn_parametric.py - RNN parametric distribution wrappers

  • twiga/models/nn/prob/rnn_qr.py - RNNQR / RNNFPQR

  • twiga/models/nn/prob/ - Probabilistic network variants (parametric, QR, FPQR, CRC)

Twiga provides five core neural forecasting architectures (MLPF, MLPGAM, MLPGAF, N-HiTS, RNN) built on PyTorch Lightning, each available as a point model and in probabilistic variants. Every model follows the same three-layer design: a config (a Pydantic model inheriting from NeuralModelConfig), a model wrapper (inheriting from BaseNeuralForecast), and an architecture (inheriting from BaseArchitecture). This page documents the base class hierarchy, each model’s architecture and configuration, and how to create neural network models within the TwigaForecaster workflow.

Base Class Hierarchy#

The diagram below shows how the three core base classes relate to each other and to PyTorch Lightning.

classDiagram
    direction TB

    class ABC {
        <<stdlib>>
    }

    class nn_Module {
        <<PyTorch>>
    }

    class LightningModule {
        <<PyTorch Lightning>>
    }

    class BaseNeuralForecast {
        <<ABC>>
        +wandb_logging: bool
        +rich_progress_bar: bool
        +batch_size: int
        +max_epochs: int
        +seed: int
        +model: LightningModule
        +trainer: Trainer
        +update(trial)* ABC
        +load_checkpoint()* ABC
        +fit(X_train, y_train, X_val, y_val, trial)
        +forecast(features) Tensor
    }

    class BaseNeuralModel {
        <<LightningModule>>
        +OPTIMIZERS: dict
        +SCHEDULERS: dict
        +tra_metric_fcn: Metric
        +val_metric_fcn: Metric
        +forward(x)
        +forecast(x)
        +training_step(batch, batch_idx)
        +validation_step(batch, batch_idx)
        +configure_optimizers()
    }

    class BaseArchitecture {
        <<ABC, nn.Module>>
        +num_target_feature: int
        +forecast_horizon: int
        +lookback_window_size: int
        +dropout: float
        +alpha: float
        +encode_size: int
        +forward(x)* ABC
        +encode(x) Tensor
        +forecast(x) dict
        +step(batch, metric_fn, epoch)
        +penalty_dict(epoch) dict
    }

    ABC <|-- BaseNeuralForecast
    LightningModule <|-- BaseNeuralModel
    nn_Module <|-- BaseArchitecture
    ABC <|-- BaseArchitecture

    BaseNeuralForecast o-- BaseNeuralModel : model
    BaseNeuralModel o-- BaseArchitecture : model

    BaseNeuralForecast <|-- MLPFModel
    BaseNeuralForecast <|-- MLPGAMModel
    BaseNeuralForecast <|-- MLPGAFModel
    BaseNeuralForecast <|-- NHITSModel
    BaseNeuralForecast <|-- RNNModel
    BaseNeuralForecast <|-- MLPFQRModel
    BaseNeuralForecast <|-- MLPFCRCModel
    BaseNeuralForecast <|-- MLPGAMCRCModel
    BaseNeuralForecast <|-- MLPGAFCRCModel
    BaseNeuralForecast <|-- NHITSCRCModel
    

BaseNeuralForecast#

BaseNeuralForecast is the top-level abstract class that every neural model wrapper inherits from. It owns the PyTorch Lightning Trainer, manages checkpointing and logging, and exposes the public fit / forecast API.

Responsibility | Method | Details
Hyperparameter update | update(trial) | Abstract - each model subclass reinitializes itself from an Optuna trial.
Checkpoint loading | load_checkpoint() | Abstract - loads the best checkpoint for the specific architecture.
Trainer setup | _set_up_trainer(trial=None) | Creates a pl.Trainer with early stopping (or Optuna pruning), model checkpointing, LR monitoring, and a progress bar.
Logger setup | _configure_logger() | WandbLogger when wandb_logging=True, otherwise TensorBoardLogger.
Training | fit(X_train, y_train, X_val, y_val, trial) | Wraps data in a TimeseriesDataModule, optionally resumes from the latest checkpoint, and calls trainer.fit().
Inference | forecast(features: Tensor) | Switches to eval mode, runs a forward pass with torch.no_grad(), and returns NumPy arrays.
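
A minimal sketch of this public API used directly (the array shapes and the (samples, time, features) layout are assumptions for illustration; within the normal workflow TwigaForecaster drives these calls for you):

import numpy as np
import torch

from twiga.models.nn.mlpf_model import MLPFConfig, MLPFModel

# Toy data in an assumed (samples, lookback, features) / (samples, horizon, targets) layout.
X_train = np.random.rand(256, 96, 1).astype("float32")
y_train = np.random.rand(256, 24, 1).astype("float32")

config = MLPFConfig(
    num_target_feature=1,
    forecast_horizon=24,
    lookback_window_size=96,
    max_epochs=1,
)
model = MLPFModel(model_config=config)

model.fit(X_train, y_train)                        # wraps data in a TimeseriesDataModule
y_hat = model.forecast(torch.from_numpy(X_train))  # eval mode + torch.no_grad()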

BaseNeuralModel#

BaseNeuralModel extends pl.LightningModule and handles the training loop mechanics: optimizer/scheduler creation, metric logging, and dual-optimizer support for models that learn both a mean (mu) and a variance (sigma) head.

Built-in optimizers:

Key | Class | Default LR | Default Weight Decay
adam | torch.optim.Adam | 1e-3 | 1e-6
adamw | torch.optim.AdamW | 1e-3 | 1e-5

Built-in schedulers:

Key | PyTorch Class | Notes
multi_step | MultiStepLR | Milestones at prob_decay_1 and prob_decay_2 fractions of max_epochs
cosine_annealing | CosineAnnealingWarmRestarts | T_0 computed from prob_T0 * max_epochs
reduce_on_plateau | ReduceLROnPlateau | Patience computed from prob_patience * max_epochs
one_cycle | OneCycleLR | Uses estimated_stepping_batches for total steps
warmup_multi_step | SequentialLR(LambdaLR, MultiStepLR) | Linear warmup followed by multi-step decay

Metric mapping:

Config string | Torchmetrics Class
"mae" | MeanAbsoluteError
"mse" | MeanSquaredError
"smape" | SymmetricMeanAbsolutePercentageError
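
These keys are selected by name. For example, on the generic probabilistic wrapper documented below (a sketch; MLPFNetwork, NormalDistribution, and backbone_kwargs are used loosely here, as in the probabilistic section later on this page):

from twiga.models.nn.prob.core import ProbabilisticModel

model = ProbabilisticModel(
    backbone_cls=MLPFNetwork,              # any backbone implementing encode()
    backbone_kwargs=backbone_kwargs,       # architecture kwargs, abbreviated here
    head_cls=NormalDistribution,
    head_kwargs={},
    metric="smape",                        # -> SymmetricMeanAbsolutePercentageError
    optimizer_type="adamw",                # defaults: lr=1e-3, weight_decay=1e-5
    lr_scheduler_type="cosine_annealing",  # CosineAnnealingWarmRestarts
    max_epochs=50,
)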

BaseArchitecture#

BaseArchitecture is the nn.Module ABC that every network architecture extends. It defines the data dimensions, the default composite loss function, and the forecast method.

The default loss function in step() combines MSE and MAE with a tunable alpha:

loss = alpha * MSE(y_pred, y) + (1 - alpha) * MAE(y_pred, y)

Models that need custom losses (MLPGAF, MLPFQR) override step() in their own architecture class.
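
In PyTorch functional form the default objective is equivalent to the following sketch (not the literal source):

import torch
import torch.nn.functional as F

def composite_loss(y_pred: torch.Tensor, y: torch.Tensor, alpha: float = 0.1) -> torch.Tensor:
    # alpha weights MSE against MAE; the default alpha=0.1 leans toward the robust MAE term.
    return alpha * F.mse_loss(y_pred, y) + (1 - alpha) * F.l1_loss(y_pred, y)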

BaseArchitecture also defines the probabilistic backbone contract:

  • encode(x) Tensor - maps the raw input (B, T, F) to a latent vector (B, encode_size). The default implementation raises NotImplementedError; each architecture (MLPF, MLPGAM, MLPGAF, NHITS) overrides it.

  • encode_size: int property - returns self._encode_size, set in each architecture’s __init__.

  • penalty_dict(epoch) dict - returns architecture-specific regularisation terms for probabilistic training. Defaults to {}.

These three methods are the interface that ProbabilisticModel relies on to wire a backbone to any distribution head.
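
A minimal sketch of a custom backbone satisfying this contract (the layer choices and kwargs passthrough are illustrative, not taken from the source):

import torch
from torch import nn

from twiga.models.nn.core.base_arch import BaseArchitecture

class TinyBackbone(BaseArchitecture):
    def __init__(self, *args, **kwargs):
        self._encode_size = 64            # set before super().__init__, per the encode_size docs
        super().__init__(*args, **kwargs)
        self.encoder = nn.LazyLinear(self._encode_size)
        self.out = nn.Linear(self._encode_size,
                             self.forecast_horizon * self.num_target_feature)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # (B, T, F) -> (B, encode_size); LazyLinear infers the flattened input width.
        return self.encoder(x.flatten(start_dim=1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.out(self.encode(x))
        return out.view(x.shape[0], self.forecast_horizon, self.num_target_feature)

    # penalty_dict() is inherited and returns {} - no extra regularisation here.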

Embeddings#

The embedding system uses a registry pattern. Available embedding types:

Category | Registry Key | Class | Description
Value | "ConvEmb" | ConvEmbedding | 1D circular convolution (kernel=3) that projects input channels to hidden_size
Value | "LinearEmb" | LinearEmbedding | Simple linear projection from c_in to hidden_size
Positional | "RotaryEmb" | RotaryEmbedding | Rotary positional encoding (RoPE) with dynamic buffer extension
Positional | "PosEmb" | PositionalEmbedding | Classic sinusoidal positional encoding
Positional | "TimeEmb" | TimeDEmbedding | Learned linear time embedding with ReLU activation

DataEmbedding combines one value embedding and one positional embedding (both optional) with dropout.
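
In practice the embeddings are selected through config fields rather than instantiated directly; for example (illustrative values):

from twiga.models.nn.mlpf_model import MLPFConfig

# RoPE positional encoding on top of a convolutional value embedding.
config = MLPFConfig(
    num_target_feature=1,
    forecast_horizon=24,
    lookback_window_size=96,
    embedding_type="RotaryEmb",
    value_embed_type="ConvEmb",
    embedding_size=32,
)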

Linear Layers and Building Blocks#

  • create_linear(in_ch, out_ch, bn, activation, use_bias) – Factory that creates nn.Linear with activation-aware weight initialization (Kaiming, Xavier, or LeCun depending on the activation function). Optionally wraps with BatchNorm1d.

  • MLPBlock – Multi-layer perceptron: input is flattened from (batch, context_size, in_size), passed through num_layers linear layers with optional residual connections, and projected to latent_dim.

  • PastFutureEncoder – LayerNorm, then optional DataEmbedding, then MLPBlock. Used to encode past or future feature groups independently.

  • FeedForward – Expand-activate-dropout-contract block: dim to dim * expansion_factor and back.
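
A hedged sketch of these building blocks (argument names follow the signatures above; passing the activation by name and the FeedForward keyword names are assumptions):

from twiga.models.nn.core.linear import FeedForward, create_linear

# nn.Linear(64, 128) with activation-aware (Kaiming-style) init, wrapped in BatchNorm1d.
layer = create_linear(64, 128, bn=True, activation="SiLU", use_bias=True)

# Expand-activate-dropout-contract: 128 -> 512 -> 128 (keyword names assumed).
ff = FeedForward(dim=128, expansion_factor=4)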

Point Forecast Models#

MLPF - MLP Forecaster#

An MLP-based forecaster with attention-based feature combination. Separate PastFutureEncoder modules process each feature group (target, historical, calendar, exogenous, future covariates), and the resulting latent vectors are combined via multi-head attention, learned weights, or simple addition.

Config class: MLPFConfig

Registered name: "mlpf" | Domain: "nn" | Extends: NeuralModelConfig

  • num_target_feature (int, required) – Number of target variables to forecast

  • num_historical_features (int, default 0) – Historical (unknown-future) feature count

  • num_calendar_features (int, default 0) – Calendar feature count

  • num_exogenous_features (int, default 0) – Exogenous (known-future) feature count

  • num_future_covariates (int, default 0) – Future-only covariate count

  • forecast_horizon (int, required) – Steps ahead to predict

  • lookback_window_size (int, required) – Input window length

  • embedding_size (int, default 28) – Dimensionality of the embedding space

  • embedding_type (Literal["RotaryEmb", "PosEmb"] | None, default None) – Positional embedding type

  • value_embed_type (Literal[None, "ConvEmb", "LinearEmb"], default None) – Value embedding type

  • combination_type (Literal["attn-comb", "weighted-comb", "addition-comb"], default "addition-comb") – Strategy for combining encoded feature groups

  • hidden_size (int, default 256) – Hidden layer dimensionality

  • num_layers (int, default 2) – Number of MLP layers per encoder

  • activation_function (str, default "SiLU") – Hidden layer activation

  • out_activation_function (str, default "Identity") – Output activation

  • dropout (float, default 0.25) – Dropout probability

  • alpha (float, default 0.1) – MSE/MAE loss weighting

  • num_attention_heads (int, default 4) – Heads for "attn-comb" combination
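
For example, to combine feature groups with multi-head attention instead of the default addition (values are illustrative):

from twiga.models.nn.mlpf_model import MLPFConfig

config = MLPFConfig(
    num_target_feature=1,
    num_calendar_features=3,
    num_exogenous_features=2,
    forecast_horizon=48,
    lookback_window_size=168,
    combination_type="attn-comb",
    num_attention_heads=4,   # only used with "attn-comb"
)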

MLPGAM - MLP Generalized Additive Model#

Extends the MLPF architecture with a generalized additive model (GAM) structure: each feature group contributes independently to the final forecast via its own linear projection, and the outputs are summed. A Lasso penalty (lambda_lasso) encourages sparsity across group contributions.

Config class: MLPGAMConfig

Registered name: "mlpgam" | Domain: "nn" | Extends: NeuralModelConfig

MLPGAMConfig shares all fields with MLPFConfig (except combination_type and num_attention_heads) and adds:

  • lambda_lasso (float, default 1e-6) – Lasso (L1) penalty coefficient for group-level sparsity
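
For example, to encourage sparser group contributions (illustrative value; data_config as defined in the Usage section below):

from twiga.models.nn.mlpgam_model import MLPGAMConfig

# A stronger L1 penalty shrinks more group contributions toward zero.
mlpgam_config = MLPGAMConfig.from_data_config(data_config, lambda_lasso=1e-4, max_epochs=50)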

MLPGAF - MLP with Gated Additive Features#

Builds on the GAM structure with learned feature gates and an optional sigma head for uncertainty estimation. Feature gates use a sigmoid function (controlled by gate_scale) to perform soft feature selection, and a warm-up schedule (warmup_epochs) gradually activates the gate penalty.

Config class: MLPGAFConfig

Registered name: "mlpgaf" | Domain: "nn" | Extends: NeuralModelConfig

  • embedding_type (None) – Fixed to None for MLPGAF

  • value_embed_type (Literal["ConvEmb", "LinearEmb"] | None, default "LinearEmb") – Value embedding strategy

  • use_bias (bool, default True) – Include bias terms in linear layers

  • bn (bool, default True) – Use batch normalization

  • residual_connection (bool, default False) – Enable residual connections in MLP blocks

  • hidden_size (int, default 256) – Hidden layer dimensionality

  • num_layers (int, default 2) – Number of MLP layers

  • activation_function (Literal["SiLU", "ELU", "LeakyReLU"], default "SiLU") – Hidden activation (restricted set)

  • dropout (float, default 0.1) – Dropout probability

  • alpha (float, default 0.2) – MSE/MAE loss weighting

  • lambda_weight (float, default 1e-6) – Lasso penalty for group weights

  • lambda_gate (float, default 1e-4) – Lasso penalty for feature gates

  • delta (float, default 0.05) – Huber-loss delta for gate penalty

  • lambda_sigma (float, default 1e-6) – Penalty coefficient for the sigma (residual) head

  • gate_scale (float, default 5.0) – Temperature scaling for gates (higher = more binary)

  • gate_type (Literal["sigmoid"], default "sigmoid") – Gating mechanism type

  • warmup_epochs (int, default 10) – Epochs before gate penalty is fully active

  • sigma_type (str, default "Residual") – Uncertainty modeling type for the sigma head
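
For example, sharper gates and a longer penalty warm-up (illustrative values; data_config as defined in the Usage section below):

from twiga.models.nn.mlpgaf_model import MLPGAFConfig

mlpgaf_config = MLPGAFConfig.from_data_config(
    data_config,
    gate_scale=10.0,    # pushes gates toward 0/1 (more binary feature selection)
    lambda_gate=1e-3,   # stronger pressure to switch features off
    warmup_epochs=20,   # delay the full gate penalty
)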

N-HiTS - Neural Hierarchical Interpolation for Time Series#

N-HiTS decomposes the forecasting problem into multiple resolution levels using stacked hierarchical interpolation blocks. Each stack processes the input at a different temporal resolution:

  1. Input pooling - a MaxPool1d or AvgPool1d layer with a per-stack kernel size (n_pool_kernel_size) downsamples the lookback window to capture patterns at different scales.

  2. MLP processing - within each stack, one or more MLP blocks (n_blocks) with configurable layer sizes (mlp_units) produce basis expansion coefficients.

  3. Interpolation - the low-resolution coefficients are upsampled back to the forecast horizon using the chosen interpolation_mode (linear, nearest, or cubic), with a per-stack downsampling factor (n_freq_downsample) controlling the number of coefficients.

  4. Hierarchical summation - forecasts from all stacks are summed to produce the final prediction, combining coarse global trends with fine local patterns.

Config class: NHITSConfig

Registered name: "nhits" | Domain: "nn" | Extends: NeuralModelConfig

  • stack_types (list[Literal["identity"]], default ["identity", "identity", "identity"]) – Block type per stack

  • n_blocks (list[int], default [1, 1, 1]) – Number of MLP blocks per stack

  • mlp_units (list[list[int]], default [[512, 512], [512, 512], [512, 512]]) – Layer sizes in each block’s MLP

  • n_pool_kernel_size (list[int], default [2, 2, 1]) – Pooling kernel size per stack

  • n_freq_downsample (list[int], default [4, 2, 1]) – Frequency downsampling factor per stack

  • pooling_mode (Literal["MaxPool1d", "AvgPool1d"], default "MaxPool1d") – Pooling strategy

  • interpolation_mode (Literal["linear", "nearest", "cubic"], default "linear") – Upsampling interpolation method

  • decompose_forecast (bool, default False) – Output per-stack decomposition for interpretability

  • activation_function (str, default "ReLU") – Hidden layer activation

  • dropout (float, default 0.25) – Dropout probability

  • alpha (float, default 0.1) – MSE/MAE loss weighting

Configuring N-HiTS stacks

The lists stack_types, n_blocks, mlp_units, n_pool_kernel_size, and n_freq_downsample must all have the same length (one entry per stack). Use larger pooling kernels and downsampling factors in earlier stacks to capture long-range trends, and smaller values in later stacks for short-range detail.
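
For example, a four-stack configuration where every per-stack list has length 4 (illustrative values; data_config as defined in the Usage section below):

from twiga.models.nn.nhits_model import NHITSConfig

nhits_config = NHITSConfig.from_data_config(
    data_config,
    stack_types=["identity"] * 4,
    n_blocks=[1, 1, 1, 1],
    mlp_units=[[512, 512]] * 4,
    n_pool_kernel_size=[8, 4, 2, 1],   # coarse -> fine resolution
    n_freq_downsample=[8, 4, 2, 1],
)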

RNN - Recurrent Neural Network Forecaster#

A GRU/LSTM-based sequence model. The lookback window is processed step-by-step by a stacked recurrent network, and the final hidden state is projected to the forecast horizon. A LayerNorm is applied to the latent vector before projection, and an optional bidirectional mode doubles the encoding capacity.

Key architectural details:

  • Cell type — "gru" (default) or "lstm". GRU has fewer parameters and often trains faster; LSTM provides gated memory that can help on longer lookback windows.

  • Bidirectional — when enabled, forward and backward passes over the past window are concatenated, so encode_size = hidden_size × 2.

  • Encode contract — encode(x) extracts the past feature window, runs the RNN, takes the final hidden state (concatenating both directions if bidirectional), and applies LayerNorm. This makes RNNForecastNetwork directly compatible with all parametric and quantile distribution heads.

  • Multi-layer dropout — inter-layer dropout is active only when n_layers > 1; a single-layer RNN has no dropout applied by PyTorch to the recurrent connections.

Config class: RNNConfig

Registered name: "rnn" | Domain: "nn" | Extends: NeuralModelConfig

  • num_target_feature (int, default 0) – Number of target variables to forecast

  • num_historical_features (int, default 0) – Historical (unknown-future) feature count

  • num_calendar_features (int, default 0) – Calendar feature count

  • num_exogenous_features (int, default 0) – Exogenous (known-future) feature count

  • num_future_covariates (int, default 0) – Future-only covariate count

  • forecast_horizon (int, default 0) – Steps ahead to predict

  • lookback_window_size (int, default 0) – Input window length

  • hidden_size (int, default 128) – Width of the RNN hidden state

  • n_layers (int, default 2) – Number of stacked RNN layers

  • cell_type (Literal["gru", "lstm"], default "gru") – RNN cell type

  • bidirectional (bool, default False) – Use a bidirectional RNN

  • dropout (float, default 0.25) – Dropout probability (inter-layer only when n_layers > 1)

  • alpha (float, default 0.1) – MSE/MAE loss weighting

  • out_activation_function (str, default "Identity") – Output activation

Bidirectional encoding

Setting bidirectional=True doubles encode_size from hidden_size to hidden_size × 2. This increases model capacity (the recurrent parameter count roughly doubles) and makes it possible to attend to both past and “future” context within the lookback window.
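
For example (illustrative; data_config as defined in the Usage section below):

from twiga.models.nn.rnn_model import RNNConfig

# encode_size becomes hidden_size * 2 = 256 when bidirectional=True.
rnn_config = RNNConfig.from_data_config(data_config, hidden_size=128, bidirectional=True)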

Probabilistic Models#

Probabilistic NN models use the backbone/head architecture from twiga/models/nn/prob/core.py. Each architecture (MLPF, MLPGAM, MLPGAF, NHITS) can be paired with any distribution head by selecting the appropriate config class. TwigaForecaster resolves the pairing automatically when you pass a distribution field or use a named probabilistic config.

How the backbone/head pattern works#

# ProbabilisticModel wires these together automatically:
backbone = MLPFNetwork(**backbone_kwargs)           # implements encode()
head = NormalDistribution(hidden_size=backbone.encode_size, ...)
model = ProbabilisticNetwork(backbone, head)

# Training step:
z = backbone.encode(x)          # (B, encode_size)
loss, metric = head.step(z, y, metric_fn, epoch)

Because hidden_size is injected from backbone.encode_size, distribution heads are backbone-agnostic - the same NormalDistribution head works with MLPF, MLPGAM, MLPGAF, and NHITS without modification.

Available probabilistic variants#

Each base architecture supports these distribution heads:

Distribution | MLPF | MLPGAM | MLPGAF | NHITS | RNN | Config suffix
Normal | ✓ | ✓ | ✓ | ✓ | ✓ | NormalConfig
Laplace | ✓ | ✓ | ✓ | ✓ | ✓ | LaplaceConfig
LogNormal | ✓ | ✓ | ✓ | ✓ | ✓ | LogNormalConfig
Gamma | ✓ | ✓ | ✓ | ✓ | ✓ | GammaConfig
Beta | ✓ | ✓ | ✓ | ✓ | ✓ | BetaConfig
StudentT | ✓ | ✓ | ✓ | ✓ | ✓ | StudentTConfig
QR | ✓ | ✓ | ✓ | ✓ | ✓ | QRConfig
FPQR | ✓ | ✓ | ✓ | ✓ | ✓ | FPQRConfig
CRC | ✓ | ✓ | ✓ | ✓ | - | CRCConfig

from twiga.models.nn.mlpfnormal_model import MLPFNormalConfig
from twiga.models.nn.nhitsgamma_model import NHITSGammaConfig
from twiga.models.nn.mlpgamqr_model import MLPGAMQRConfig

# All created and passed to TwigaForecaster the same way as point models
mlpf_normal = MLPFNormalConfig.from_data_config(data_config, max_epochs=50)
nhits_gamma  = NHITSGammaConfig(num_target_feature=1, forecast_horizon=24,
                                lookback_window_size=168, max_epochs=50)
mlpgam_qr    = MLPGAMQRConfig.from_data_config(data_config, quantiles=[0.1, 0.5, 0.9])

Alternatively, set the distribution field on any base config and TwigaForecaster resolves the probabilistic variant automatically:

from twiga.models.nn.mlpf_model import MLPFConfig

# Equivalent to MLPFNormalConfig
mlpf_config = MLPFConfig.from_data_config(data_config, distribution="normal")

MLPF-QR - MLPF with Quantile Regression#

Extends the MLPF architecture with a multi-quantile output head. Instead of producing a single point forecast, MLPF-QR outputs one forecast per quantile, trained with either a pinball loss or a smoothed Huber-pinball variant.

Config class: MLPFQRConfig

Registered name: "mlpfqr" | Domain: "nn" | Extends: MLPFConfig

  • quantiles (list[float] | None, default None) – Quantile levels to forecast (e.g., [0.1, 0.5, 0.9]). When None, quantiles are derived from conf_level.

  • conf_level (float, default 0.05) – Confidence level for automatically computed quantile bounds

  • loss_fn (Literal["pinball", "huber-pinball"], default "pinball") – Loss function

  • kappa (float, default 0.25) – Smoothing parameter for the Huber-pinball loss

  • eps (float, default 1e-6) – Epsilon for numerical stability

All other fields are inherited from MLPFConfig. See Quantile Regression for details on the loss functions and quantile calibration.

MLPGAM-CRC - MLPGAM with Conditional Residual Calibration#

Pairs the MLPGAM backbone with an AdditiveCRCDistribution head. The backbone’s additive mean (the sum of group-wise projections) is used directly as \(\mu\) — no extra linear projection is applied — so the GAM decomposition is fully preserved. A two-layer sigma MLP, conditioned on the detached \(\mu\), learns to predict the per-step absolute residual \(|y - \mu|\).

Config class: MLPGAMCRCConfig

Registered name: "mlpgamcrc" | Domain: "nn" | Extends: MLPGAMConfig

MLPGAMCRCConfig inherits every field from MLPGAMConfig (including lambda_lasso) and adds:

  • two_stage (bool, default True) – Enable two-stage training: Stage 1 trains the backbone and mu path; Stage 2 freezes the backbone and trains only the sigma MLP. Set False to train jointly with a single optimizer.

  • stage1_epochs (int | None, default None) – Epochs allocated to Stage 1 (backbone + mu only). When set, uses sequential two-stage training: epochs 0..stage1_epochs-1 train mu; epochs stage1_epochs..max_epochs train sigma with a frozen backbone. When None, uses per-batch optimizer alternation.

  • sigma_loss_fn (Literal["hybrid_sqrt", "hybrid", "gaussian", "laplace", "mse", "log"], default "hybrid_sqrt") – Calibration objective for the scale estimate σ. "hybrid_sqrt" uses variance-stabilised residuals; see CRC Distribution for all options.

  • alpha (float, default 0.1) – Weight for MSE vs MAE in the "hybrid" / "hybrid_sqrt" sigma loss.

  • monitor (Literal["loss", "mae", "mse", "smape", "sigma_loss"] | None, default None) – Metric key for EarlyStopping and ModelCheckpoint. None defers to the metric field. Use "sigma_loss" to monitor sigma calibration loss during CRC Stage 2 training.

See CRC Distribution for the full head architecture and loss equations.
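
A configuration sketch for sequential two-stage training (illustrative values; data_config as defined in the Usage section below):

from twiga.models.nn.mlpgamcrc_model import MLPGAMCRCConfig

mlpgamcrc_config = MLPGAMCRCConfig.from_data_config(
    data_config,
    max_epochs=60,
    stage1_epochs=40,            # epochs 0-39 train mu; 40-59 train sigma on a frozen backbone
    sigma_loss_fn="hybrid_sqrt",
    monitor="sigma_loss",        # track sigma calibration during Stage 2
)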

Model Comparison#

Model | Type | Feature Combination | Interpretable | Sparsity | Probabilistic variants | Best For
MLPF | Point | Attention / weighted / additive | - | - | Normal, Laplace, LogNormal, Gamma, Beta, StudentT, QR, FPQR, CRC | General-purpose deep forecasting
MLPGAM | Point | Additive (GAM) | Yes | L1 Lasso | Normal, Laplace, LogNormal, Gamma, Beta, StudentT, QR, FPQR, CRC | Interpretable group-additive forecasts
MLPGAF | Point | Gated additive features | Yes | Gate + weight Lasso | Normal, Laplace, LogNormal, Gamma, Beta, StudentT, QR, FPQR, CRC | Feature selection with uncertainty
N-HiTS | Point | Hierarchical stack sum | Decomposable | - | Normal, Laplace, LogNormal, Gamma, Beta, StudentT, QR, FPQR, CRC | Multi-resolution temporal patterns
RNN | Point | Recurrent (GRU/LSTM) | - | - | Normal, Laplace, LogNormal, Gamma, Beta, StudentT, QR, FPQR | Sequential dependencies, variable-length lookback

Choosing a distribution

Use Normal/Laplace for general energy targets. Gamma/LogNormal for strictly positive variables (solar generation, prices). Beta for capacity factors or state-of-charge (bounded [0, 1]). StudentT for spot prices with very heavy tails. QR/FPQR when you want quantiles directly without a parametric assumption.

Usage#

Creating a Model via from_data_config#

The recommended way to instantiate a neural network model is through the from_data_config classmethod on the config class. This automatically computes the input and output dimensions from your DataPipelineConfig.

from sklearn.preprocessing import StandardScaler, RobustScaler

from twiga.core.config import DataPipelineConfig, ForecasterConfig
from twiga.forecaster.core import TwigaForecaster
from twiga.models.nn.mlpf_model import MLPFConfig
from twiga.models.nn.nhits_model import NHITSConfig
from twiga.models.nn.rnn_model import RNNConfig

# 1. Define data pipeline
data_config = DataPipelineConfig(
    target_feature="load_mw",
    period="1h",
    lookback_window_size=168,
    forecast_horizon=48,
    calendar_features=["hour", "dayofweek", "month"],
    exogenous_features=["ghi", "temperature"],
    input_scaler=StandardScaler(),
    target_scaler=RobustScaler(),
)

# 2. Create model configs - dimensions are inferred automatically
mlpf_config = MLPFConfig.from_data_config(
    data_config,
    max_epochs=50,
    batch_size=128,
    hidden_size=256,
    num_layers=3,
    patience=10,
)

nhits_config = NHITSConfig.from_data_config(
    data_config,
    max_epochs=50,
    batch_size=128,
    n_pool_kernel_size=[2, 2, 1],
    n_freq_downsample=[4, 2, 1],
    patience=10,
)

rnn_config = RNNConfig.from_data_config(
    data_config,
    max_epochs=50,
    batch_size=128,
    hidden_size=128,
    n_layers=2,
    cell_type="gru",
    bidirectional=False,
    patience=10,
)

# 3. Train via TwigaForecaster
train_config = ForecasterConfig(
    split_freq="months",
    train_size=6,
    test_size=1,
    window="expanding",
    project_name="EnergyForecast",
)

forecaster = TwigaForecaster(
    data_params=data_config,
    model_params=[mlpf_config, nhits_config, rnn_config],
    train_params=train_config,
)

forecaster.fit(train_df=train_df, val_df=val_df)
predictions, metrics = forecaster.evaluate_point_forecast(test_df=test_df)

Always use from_data_config for NN models

Neural models require dimensional parameters (num_target_feature, num_calendar_features, etc.) that must match the data pipeline exactly. Using from_data_config guarantees consistency. Setting these values manually is error-prone and not recommended.

Probabilistic Forecasting with MLPF-QR#

from twiga.models.nn.mlpfqr_model import MLPFQRConfig

mlpfqr_config = MLPFQRConfig.from_data_config(
    data_config,
    max_epochs=50,
    batch_size=128,
    quantiles=[0.1, 0.5, 0.9],
    loss_fn="pinball",
    patience=10,
)

See Quantile Regression for a full walkthrough.

API Reference#

Base classes#

class twiga.models.nn.core.base.BaseNeuralForecast(rich_progress_bar=False, wandb_logging=False, drop_last=True, num_workers=8, batch_size=64, pin_memory=True, max_epochs=1, seed=123, resume_training=True, checkpoints_path='checkpoints', logs_path='logs', project_name='twiga', metric='mae', patience=None, direction='min', gradient_clip_val=None)#

Bases: ABC

Base class for all neural forecasting models in Twiga.

Variables:
  • wandb_logging – Whether to use Weights & Biases logging.

  • rich_progress_bar – Whether to use rich progress bar.

  • drop_last – Whether to drop last incomplete batch.

  • num_workers – Number of workers for data loading.

  • batch_size – Number of samples per batch.

  • max_epochs – Maximum number of training epochs.

  • pin_memory – Whether to pin memory for faster GPU transfer.

  • seed – Random seed for reproducibility.

  • resume_training – Whether to resume from latest checkpoint.

  • model – The neural network model.

  • trainer – PyTorch Lightning trainer instance.

  • checkpoints_path – Path to save model checkpoints.

  • logs_path – Path to save training logs.

  • project_name – Name of the project for logging.

  • file_name (str | None) – Base name for log files.

__init__(rich_progress_bar=False, wandb_logging=False, drop_last=True, num_workers=8, batch_size=64, pin_memory=True, max_epochs=1, seed=123, resume_training=True, checkpoints_path='checkpoints', logs_path='logs', project_name='twiga', metric='mae', patience=None, direction='min', gradient_clip_val=None)#

Initialize base neural forecaster.

Parameters:
  • rich_progress_bar (bool) – Use rich progress bar. Defaults to False.

  • wandb_logging (bool) – Enable Weights & Biases logging. Defaults to False.

  • drop_last (bool) – Drop last incomplete batch. Defaults to True.

  • num_workers (int) – Number of data loader workers. Defaults to 8.

  • batch_size (int) – Batch size for training. Defaults to 64.

  • pin_memory (bool) – Pin memory for faster GPU transfer. Defaults to True.

  • max_epochs (int) – Maximum training epochs. Defaults to 1.

  • seed (int) – Random seed. Defaults to 123.

  • resume_training (bool) – Resume from latest checkpoint. Defaults to True.

  • checkpoints_path (str | Path) – Path for model checkpoints. Defaults to “checkpoints”.

  • logs_path (str | Path) – Path for training logs. Defaults to “logs”.

  • project_name (str) – Project name for logging. Defaults to “twiga”.

  • file_name (str | None) – Base filename for logs. Defaults to None.

  • metric (str) – Metric to optimize. Defaults to “mae”.

  • patience (int | None) – Early stopping patience. Defaults to None.

  • direction (str) – Optimization direction. Defaults to “min”.

  • gradient_clip_val (float | None) – Max gradient norm for clipping passed to pl.Trainer. None disables clipping.

fit(X_train, y_train, X_val=None, y_val=None, trial=None)#

Train the model on given data.

Parameters:
  • X_train (ndarray) – Training input features.

  • y_train (ndarray) – Training target values.

  • X_val (ndarray | None) – Validation input features. Defaults to None.

  • y_val (ndarray | None) – Validation target values. Defaults to None.

  • trial (Trial | None) – Optuna trial for hyperparameter optimization. Defaults to None.

Raises:

RuntimeError – If model not initialized or training fails.

Return type:

None

forecast(features)#

Generate forecasts using the trained model.

Parameters:

features (Tensor) – Input features for forecasting.

Return type:

Tensor

Returns:

Forecasted values as a numpy array.

abstractmethod load_checkpoint(checkpoints_path)#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

abstractmethod update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.core.base_model.BaseNeuralModel(metric='mae', max_epochs=10, optimizer_type=None, lr_scheduler_type=None, checkpoints_path='./', optimizer_params=None, scheduler_params=None)#

Bases: LightningModule

Base class for neural network models using PyTorch Lightning.

Variables:
  • model (torch.nn.Module, optional) – The PyTorch model for inference.

  • data_pipeline (sklearn.pipeline.Pipeline, optional) – Data preprocessing pipeline.

  • tra_metric_fcn (torchmetrics.Metric) – Metric function for training.

  • val_metric_fcn (torchmetrics.Metric) – Metric function for validation.

  • checkpoints_path (str) – Directory path for saving checkpoints.

  • hparams (dict) – Hyperparameters saved during initialization.

  • OPTIMIZERS (dict) – Dictionary of optimizer types, mapping to their classes and parameters (lr, weight_decay, prob_decay_1, prob_decay_2).

  • SCHEDULERS (dict) – Dictionary of scheduler types, mapping to their classes and parameters (gamma, scheduler-specific params).

__init__(metric='mae', max_epochs=10, optimizer_type=None, lr_scheduler_type=None, checkpoints_path='./', optimizer_params=None, scheduler_params=None)#

Initializes the BaseNeuralModel.

Parameters:
  • metric (str) – Evaluation metric. Must be one of {‘mae’, ‘mse’, ‘smape’}. Defaults to ‘mae’.

  • max_epochs (int) – Maximum number of training epochs. Must be positive. Defaults to 10.

  • optimizer_type (str | None) – Optimizer type to use. Must be one of the supported optimizers. Defaults to ‘adamw’.

  • lr_scheduler_type (str | None) – Learning rate scheduler type to use. Must be one of the supported schedulers. Defaults to ‘multi_step’.

  • checkpoints_path (str) – Directory to save checkpoints. Must exist. Defaults to ‘./’.

  • optimizer_params (dict | None) – Dictionary of optimizer types, mapping to their classes and parameters. Each entry must include ‘class’ (callable) and ‘params’ (dict with ‘lr’, ‘weight_decay’, ‘prob_decay_1’, ‘prob_decay_2’). If None, uses default OPTIMIZERS. Defaults to None.

  • scheduler_params (dict | None) – Dictionary of scheduler types, mapping to their classes and parameters. Each entry must include ‘class’ (callable) and ‘params’ (dict with ‘gamma’ and scheduler-specific parameters). If None, uses default SCHEDULERS. Defaults to None.

Raises:

ValueError – If any parameter is invalid (e.g., invalid metric, negative max_epochs, non-existent checkpoints_path, invalid dictionary format).
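
For reference, a sketch of an optimizer_params override matching the format described above (only ‘class’ and ‘params’ with the listed keys are documented; the values are illustrative):

import torch

# Override only the entries you need; the format mirrors the default OPTIMIZERS dict.
optimizer_params = {
    "adamw": {
        "class": torch.optim.AdamW,
        "params": {
            "lr": 3e-4,
            "weight_decay": 1e-4,
            "prob_decay_1": 0.5,   # milestone fraction for multi_step scheduling
            "prob_decay_2": 0.75,
        },
    },
}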

calculate_model_size()#

Calculate the size of the model in MB.

Return type:

float

Returns:

float – Model size in MB.

configure_optimizers()#

Configures optimizers and learning rate schedulers.

Uses parameters from OPTIMIZERS (lr, weight_decay) and SCHEDULERS (gamma, scheduler-specific params). Supported optimizers: ‘adam’, ‘adamw’, ‘nadam’, ‘radam’, ‘adamax’, ‘adafactor’, ‘adagrad’, ‘adadelta’, ‘rmsprop’, ‘rprop’, ‘asgd’ and ‘sgd’ (all native torch.optim), plus ‘muon’ and ‘lion’, which require external packages. Supported schedulers (all native torch.optim.lr_scheduler): ‘step’, ‘multi_step’, ‘multiplicative’, ‘exponential’, ‘constant’, ‘linear_decay’, ‘polynomial’, ‘cosine_annealing’, ‘cosine_annealing_lr’, ‘cyclic’, ‘reduce_on_plateau’, ‘one_cycle’, ‘warmup_multi_step’, ‘warmup_cosine’.

Return type:

list[Optimizer] | dict

Returns:

list[optim.Optimizer] | dict

List of optimizers and schedulers,

or a dictionary for schedulers requiring monitoring (e.g., ReduceLROnPlateau).

Raises:
  • ValueError – If optimizer or scheduler type is invalid.

  • ImportError – If ‘muon’ or ‘lion’ is selected but the package is not installed.

forecast(x)#

Generate forecast for the given input.

Parameters:

x (Tensor) – Input data for forecasting.

Return type:

Tensor

Returns:

torch.Tensor – Forecasted values.

forward(x)#

Forward pass of the model.

Parameters:

x (Tensor) – Input data.

Return type:

Tensor

Returns:

torch.Tensor – Output of the model.

on_load_checkpoint(checkpoint)#

Load the data pipeline from a file.

Parameters:

checkpoint (dict) – Checkpoint dictionary.

Return type:

None

on_save_checkpoint(checkpoint)#

Save the data pipeline to a file and add the file path to the checkpoint dictionary.

Parameters:

checkpoint (dict) – Checkpoint dictionary.

Return type:

None

on_train_epoch_start()#

Switch EarlyStopping monitor from backbone metric to sigma loss at stage-2 boundary.

When sequential two-stage training is active (stage1_epochs is set), the validation metric logged in stage 1 (val_mae) is constant in stage 2 because the backbone is frozen. Without intervention, EarlyStopping fires after patience epochs in stage 2.

At the first epoch of stage 2 this hook resets the ES state and changes its monitor to "val_sigma_loss" so that early stopping tracks sigma progress.

Return type:

None

training_step(batch, batch_idx)#

Perform a single training step.

Parameters:
  • batch (tuple[Tensor, Tensor]) – A batch of training data.

  • batch_idx (int) – Index of the batch.

Return type:

Tensor

Returns:

torch.Tensor – The loss value for the batch.

static validate_params(metric, max_epochs, opt_type, opt_params, sch_type, sch_params)#

Full-rigor validation logic with no instance side effects.

validation_step(batch, batch_idx)#

Perform a single validation step.

Parameters:
  • batch (tuple[Tensor, Tensor]) – A batch of validation data.

  • batch_idx (int) – Index of the batch.

Return type:

None

Returns:

None

class twiga.models.nn.core.base_arch.BaseArchitecture(num_target_feature, num_historical_features, num_calendar_features, num_exogenous_features, num_future_covariates, forecast_horizon, lookback_window_size, dropout=0.25, alpha=0.1, output_activation='Identity')#

Bases: Module, ABC

A neural network architecture.

Variables:
  • num_target_feature (int) – Number of target series to predict.

  • num_historical_features (int) – Number of unknown exogenous features.

  • num_calendar_features (int) – Number of known calendar-based features.

  • num_exogenous_features (int) – Number of known continuous features.

  • num_future_covariates (int) – Number of future covariates that are available only in the future.

  • forecast_horizon (int, optional) – Forecast horizon (number of future time steps to predict). Defaults to 48.

  • lookback_window_size (int, optional) – Size of the input window (number of historical time steps). Defaults to 96.

  • dropout (float, optional) – Dropout rate for regularization, must be between 0 and 1. Defaults to 0.25.

  • alpha (float, optional) – Weighting parameter for certain loss or regularization functions. Defaults to 0.1.

  • output_activation (str, optional) – Activation function for the output layer. Defaults to “Identity”.

__init__(num_target_feature, num_historical_features, num_calendar_features, num_exogenous_features, num_future_covariates, forecast_horizon, lookback_window_size, dropout=0.25, alpha=0.1, output_activation='Identity')#

Initializes the BaseArchitecture class.

Parameters:
  • num_target_feature (int) – Number of target series to predict.

  • num_historical_features (int) – Number of historical features.

  • num_calendar_features (int) – Number of calendar-based features.

  • num_exogenous_features (int) – Number of exogenous features.

  • num_future_covariates (int) – Number of future covariates that are available only in the future.

  • forecast_horizon (int) – Forecast horizon (number of future time steps to predict).

  • lookback_window_size (int) – Size of the input window (number of historical time steps).

  • dropout (float | None) – Dropout rate for regularization, must be between 0 and 1. Defaults to 0.25.

  • alpha (float) – Weighting parameter for certain loss or regularization functions. Defaults to 0.1.

  • output_activation (str) – Activation function for the output layer. Defaults to “Identity”.

encode(x)#

Return the latent representation used by probabilistic heads.

Override in subclasses that support probabilistic forecasting.

Parameters:

x (Tensor) – Input tensor of shape (B, T, F).

Return type:

Tensor

Returns:

Latent tensor of shape (B, encode_size).

Raises:

NotImplementedError – If the subclass has not implemented this method.

property encode_size: int#

Width of the latent vector returned by encode().

Set self._encode_size in subclass __init__ before calling super().__init__, or override this property directly.

forecast(x)#

Generates forecasts for the input sequences.

Parameters:

x (Tensor) – Input tensor containing time series data.

Return type:

dict[str, Tensor]

Returns:

dict

A dictionary with the predicted forecast, where the key is ‘predicted_values’ and

the value is a tensor of predicted values.

abstractmethod forward(x)#

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters:

x (Tensor) – Input tensor containing time series data.

Return type:

Tensor

Returns:

torch.Tensor – Output tensor containing the forecasted values.

penalty_dict(epoch=None)#

Return backbone-specific regularisation penalties.

Override in subclasses that apply regularisation (Lasso, gate sparsity, …). The default returns an empty dict - no penalty.

Parameters:

epoch (int | None) – Current training epoch, forwarded for warmup-based schedules.

Return type:

dict[str, Tensor]

Returns:

Mapping of penalty name → scalar loss tensor.

set_num_feature_covariate_column()#

step(batch, metric_fn, epoch=None)#

Perform a single training/validation step.

Computes the combined MSE + MAE loss, adds and evaluates the provided metric function.

Parameters:
  • batch (tuple[Tensor, Tensor]) – Tuple of (input features, target values); x has shape [batch_size, input_features], y has shape [batch_size, output_size] or [batch_size]

  • metric_fn (Callable[..., Any]) – Callable that takes predictions and targets and returns a scalar metric (often used for logging/validation, e.g. MAE, RMSE, MAPE, etc.)

  • epoch (int | None) – Current training epoch number (passed to regularization scheduler). If None, no epoch-dependent behavior is applied.

Return type:

tuple[Tensor, Tensor]

Returns:

Tuple containing:
  • loss – Total loss value (MSE/MAE combination)

  • metric – Value returned by metric_fn(pred, target)

Probabilistic core#

class twiga.models.nn.prob.core.ProbabilisticNetwork(backbone, head)#

Bases: Module

Generic backbone + distribution-head network.

Composes any backbone that implements encode(x) → z with any distribution head that implements step(z, y, metric_fn, epoch).

Parameters:
  • backbone (Module) – Instantiated backbone network.

  • head (Module) – Instantiated distribution head.

forecast(x)#

Inference-mode forecast via the head’s forecast method.

Encodes x with the backbone then delegates to self.head.forecast(z) so that heads with richer outputs (e.g. QR returning loc + quantiles + quantile_levels) produce the full dict rather than the raw tensor returned by their forward() method.

Parameters:

x (Tensor) – Input tensor of shape (B, T, F).

Return type:

tuple[Tensor, ...] | dict

Returns:

Whatever self.head.forecast(z) returns - always a dict with at least "loc" key (parametric also has "scale"; quantile heads add "quantiles" and "quantile_levels").

forward(x)#

Encode input and return distribution parameters.

Parameters:

x (Tensor) – Input tensor of shape (B, T, F).

Return type:

tuple[Tensor, ...]

Returns:

Tuple of distribution parameter tensors from the head.

step(batch, metric_fn, epoch=None)#

Single training/validation step.

Encodes the input, delegates to the head’s step(), then adds any backbone regularisation penalties returned by penalty_dict().

When the backbone returns non-empty penalties the loss is promoted to a dict so BaseNeuralModel.training_step can log each component individually.

Parameters:
  • batch (tuple[Tensor, Tensor]) – (x, y) pair.

  • metric_fn (Callable) – Callable returning a scalar metric.

  • epoch (int | None) – Current epoch (forwarded to penalty_dict for warmup).

Return type:

tuple[Tensor | dict, Tensor]

Returns:

(loss_or_dict, metric)

step_sigma(batch, metric_fn)#

Sigma-only step for two-stage training.

Called by BaseNeuralModel.training_step when two optimizers are configured. The backbone is already frozen by Lightning’s toggle_optimizer at the call site; this method encodes the input and delegates to the head’s step_sigma() so only the sigma MLP receives gradient updates.

Parameters:
  • batch (tuple[Tensor, Tensor]) – (x, y) pair.

  • metric_fn (Callable) – Callable returning a scalar metric.

Return type:

tuple[Tensor, Tensor]

Returns:

(sigma_loss, metric)

class twiga.models.nn.prob.core.ProbabilisticModel(backbone_cls, backbone_kwargs, head_cls, head_kwargs, metric='mae', optimizer_type=None, lr_scheduler_type=None, checkpoints_path='./', optimizer_params=None, scheduler_params=None, max_epochs=10)#

Bases: BaseNeuralModel

Generic Lightning wrapper for backbone + distribution head.

Instantiates backbone_cls(**backbone_kwargs) then head_cls(hidden_size=backbone.encode_size, **head_kwargs), automatically injecting the backbone’s latent size into the head so callers never need to specify hidden_size explicitly.

Parameters:
  • backbone_cls (type) – Backbone class (e.g. MLPFNetwork).

  • backbone_kwargs (dict) – Constructor kwargs forwarded to the backbone.

  • head_cls (type) – Distribution head class (e.g. NormalDistribution).

  • head_kwargs (dict) – Constructor kwargs forwarded to the head. Do not include hidden_size - it is injected automatically.

  • metric (str) – Training/validation metric. Defaults to 'mae'.

  • optimizer_type (str | None) – Optimizer name (e.g. 'adamw'). Defaults to None (falls back to BaseNeuralModel default of 'adamw').

  • lr_scheduler_type (str | None) – Scheduler name (e.g. 'cosine_annealing_lr'). Defaults to None (falls back to 'multi_step').

  • checkpoints_path (str) – Directory for saving Lightning checkpoints. Defaults to './'.

  • optimizer_params (dict | None) – Override for BaseNeuralModel.OPTIMIZERS dict.

  • scheduler_params (dict | None) – Override for BaseNeuralModel.SCHEDULERS dict.

  • max_epochs (int) – Maximum training epochs. Defaults to 10.

forward(x)#

Return distribution parameters for the input batch.

Parameters:

x (Tensor) – Input tensor.

Return type:

tuple[Tensor, ...]

Returns:

Tuple of distribution parameter tensors.

MLPF — point model#

class twiga.models.nn.mlpf_model.MLPFConfig(**data)#

Bases: BaseMLPConfig

Configuration for the MLPF (Multi-Layer Perceptron Fusion) forecasting model.

Extends the shared MLP base configuration with fusion-specific parameters for combining past and future representations (attention, weighted sum, or addition).

Supports patch-based value embedding as an additional option compared to the base MLP configurations.

Variables:
  • name – Fixed model identifier (“mlpf”)

  • combination_type – Strategy for fusing encoded past and future features

  • num_attention_heads – Number of attention heads (used when combination_type = “attn-comb”)

  • patch_len – Length of each patch when using PatchEmb value embedding

  • stride – Step size between consecutive patches (when patch_len is set)

  • search_space – Hyperparameter ranges for Optuna HPO, extending the base MLP space

combination_type: Literal['attn-comb', 'weighted-comb', 'addition-comb']#
distribution: Literal['normal', 'laplace', 'lognormal', 'gamma', 'beta', 'qr', 'fpqr', 'crc'] | None#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpf']#
num_attention_heads: int#
patch_len: int | None#
search_space: BaseSearchSpace#
stride: int | None#
class twiga.models.nn.mlpf_model.MLPFModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for MLPFNetwork.

Instantiates MLPFModel from an MLPFConfig, manages training via PyTorch Lightning, and supports Optuna hyperparameter optimisation.

Example

>>> config = MLPFConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPFModel(model_config=config)
__init__(model_config)#

Initialise with a validated MLPFConfig.

Parameters:

model_config (MLPFConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions via the search space defined in MLPFConfig.search_space.

Return type:

None

MLPF — parametric variants#

class twiga.models.nn.mlpfnormal_model.MLPFNormalConfig(**data)#

Bases: MLPFConfig

Configuration for Normal-distribution probabilistic MLPF.

Best for: symmetric, unbounded targets - energy demand, temperature.

Variables:
  • name – Fixed model identifier.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpfnormal']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpfnormal_model.MLPFNormalModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Normal-distribution probabilistic MLPF.

Example

>>> config = MLPFNormalConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPFNormalModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpflaplace_model.MLPFLaplaceConfig(**data)#

Bases: MLPFConfig

Configuration for Laplace-distribution probabilistic MLPF.

Best for: heavy-tailed residuals robust to outliers - electricity prices, wind speed residuals.

Variables:
  • name – Fixed model identifier.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpflaplace']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpflaplace_model.MLPFLaplaceModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Laplace-distribution probabilistic MLPF.

Example

>>> config = MLPFLaplaceConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPFLaplaceModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpflognormal_model.MLPFLogNormalConfig(**data)#

Bases: MLPFConfig

Configuration for Log-Normal-distribution probabilistic MLPF.

Best for: strictly positive, right-skewed targets - renewable generation, gas prices, load with zero floor.

Note

Targets must be strictly positive. Apply target = target.clamp(min=1e-6) in the data pipeline if zeros are possible.

Variables:
  • name – Fixed model identifier.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpflognormal']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpflognormal_model.MLPFLogNormalModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Log-Normal-distribution probabilistic MLPF.

Example

>>> config = MLPFLogNormalConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPFLogNormalModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpfgamma_model.MLPFGammaConfig(**data)#

Bases: MLPFConfig

Configuration for Gamma-distribution probabilistic MLPF.

Best for: strictly positive targets with flexible skew - solar irradiance, wind power, aggregate load.

Note

Targets must be strictly positive. GammaDistribution does not accept an out_activation_function - both shape parameters are constrained internally via softplus.

Variables:
  • name – Fixed model identifier.

  • out_activation_function – Fixed to Identity (not used by Gamma head).

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpfgamma']#
out_activation_function: Literal['Identity']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpfgamma_model.MLPFGammaModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Gamma-distribution probabilistic MLPF.

Example

>>> config = MLPFGammaConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPFGammaModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpfbeta_model.MLPFBetaConfig(**data)#

Bases: MLPFConfig

Configuration for Beta-distribution probabilistic MLPF.

Best for: strictly bounded [0, 1] targets - capacity factors, state of charge, normalised demand ratios.

Note

Targets must lie strictly in (0, 1). Apply target = target.clamp(1e-6, 1 - 1e-6) in the data pipeline if boundary values are possible.

Variables:
  • name – Fixed model identifier.

  • out_activation_function – Fixed to Identity (not used by Beta head).

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpfbeta']#
out_activation_function: Literal['Identity']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpfbeta_model.MLPFBetaModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Beta-distribution probabilistic MLPF.

Example

>>> config = MLPFBetaConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPFBetaModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpfstudentt_model.MLPFStudentTConfig(**data)#

Bases: MLPFConfig

Configuration for Student-T distribution probabilistic MLPF.

Best for: very heavy-tailed targets - spot electricity prices, financial returns.

Variables:
  • name – Fixed model identifier.

  • min_df – Minimum degrees of freedom for the Student-T distribution. Values close to 2 yield very heavy tails; higher values approach Normal.

  • search_space – Optuna hyperparameter search space.

min_df: float#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpfstudentt']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpfstudentt_model.MLPFStudentTModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Student-T distribution probabilistic MLPF.

Example

>>> config = MLPFStudentTConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPFStudentTModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

MLPF — quantile variants#

class twiga.models.nn.mlpfqr_model.MLPFQRConfig(**data)#

Bases: MLPFConfig

Configuration for the MLPFQR fixed-grid quantile forecasting model.

Extends MLPFConfig with quantile regression parameters. The encoder architecture (latent_size, hidden_dim, width_multiplier, use_norm, etc.) is fully inherited and tunable.

Variables:
  • name – Fixed model identifier.

  • quantiles – Explicit quantile levels to forecast. If None, defaults inside QRDistribution apply.

  • conf_level – Confidence level for symmetric interval construction.

  • loss_fn – Pinball or Huber-pinball quantile loss.

  • kappa – Huber transition parameter (only relevant for ‘huber-pinball’).

  • eps – Numerical stability epsilon.

  • search_space – Optuna hyperparameter search space.

conf_level: float#
crossing_penalty: float#
eps: float#
gradient_clip_val: float | None#
kappa: float#
loss_fn: Literal['pinball', 'huber-pinball']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpfqr']#
quantiles: list[float] | None#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpfqr_model.MLPFQRModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for fixed-grid quantile regression (MLPFQR).

Instantiates MLPFQR from an MLPFQRConfig, manages training via PyTorch Lightning, and supports Optuna hyperparameter optimisation.

Example

>>> config = MLPFQRConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
...     quantiles=[0.1, 0.5, 0.9],
... )
>>> model = MLPFQRModel(model_config=config)
__init__(model_config)#

Initialise with a validated MLPFQRConfig.

Parameters:

model_config (MLPFQRConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions via the search space defined in MLPFQRConfig.search_space.

Return type:

None

class twiga.models.nn.mlpffpqr_model.MLPFFPQRConfig(**data)#

Bases: MLPFConfig

Configuration for the MLPFFPQR flexible quantile proposal forecasting model.

Extends MLPFConfig with FPQR-specific parameters. The encoder architecture (hidden_dim, width_multiplier, use_norm, etc.) is fully inherited and tunable. Unlike fixed-grid quantile regression, the model adaptively proposes its own quantile grid during training via FPQRDistribution.

Variables:
  • name – Fixed model identifier.

  • latent_size – Encoder output dimension fed into the FPQR distribution head. Note: FPQR re-declares latent_size rather than relying on the inherited value because the distribution head’s hidden_dim is sized to match the encoder’s output directly.

  • n_quantiles – Number of quantile levels for the proposal network.

  • conf_level – Confidence level for symmetric interval construction.

  • loss_fn – Pinball or Huber-pinball quantile loss.

  • kappa – Huber transition parameter (only relevant for ‘huber-pinball’).

  • num_cosines – Number of cosine basis functions in the quantile embedding layer. Higher values allow finer-grained quantile positioning (a sketch of this embedding follows the list).

  • search_space – Optuna hyperparameter search space.
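
A sketch of an IQN-style cosine embedding of quantile levels, which is the standard construction this parameter suggests (the exact basis used by FPQRDistribution may differ):

>>> import torch
>>> def cosine_quantile_embedding(tau, num_cosines=64):
...     """Embed levels tau in (0, 1) with cos(i * pi * tau), i = 1..num_cosines."""
...     i = torch.arange(1, num_cosines + 1, dtype=tau.dtype)
...     return torch.cos(torch.pi * i * tau.unsqueeze(-1))
>>> tau = torch.rand(32, 9)                       # (batch, n_quantiles)
>>> emb = cosine_quantile_embedding(tau)          # (32, 9, 64)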

conf_level: float#
gradient_clip_val: float | None#
kappa: float#
latent_size: int#
loss_fn: Literal['pinball', 'huber-pinball']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

n_quantiles: int#
name: Literal['mlpffpqr']#
num_cosines: int#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpffpqr_model.MLPFFPQRModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for flexible quantile proposal regression (MLPFFPQR).

Instantiates MLPFFPQR from an MLPFFPQRConfig, manages training via PyTorch Lightning, and supports Optuna hyperparameter optimisation.

Example

>>> config = MLPFFPQRConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
...     n_quantiles=9,
... )
>>> model = MLPFFPQRModel(model_config=config)
__init__(model_config)#

Initialise with a validated MLPFFPQRConfig.

Parameters:

model_config (MLPFFPQRConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions via the search space defined in MLPFFPQRConfig.search_space.

Return type:

None

class twiga.models.nn.mlpfcrc_model.MLPFCRCConfig(**data)#

Bases: MLPFConfig

Configuration for MLPF + Conditional Residual Calibration.

Pairs the MLPF backbone with a CRC uncertainty head that approximates absolute residuals. Inherits all MLPF architecture parameters.
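
As a minimal illustration of the quantity such a head regresses (the tensor shapes and the detach are assumptions made for the sketch):

>>> import torch
>>> mu = torch.randn(32, 24)                 # backbone mean forecast
>>> y = torch.randn(32, 24)                  # observed targets
>>> abs_resid = (y - mu.detach()).abs()      # regression target for the CRC head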

Variables:
  • name – Fixed model identifier.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpfcrc']#
search_space: BaseSearchSpace#
sigma_loss_fn: Literal['gaussian', 'laplace', 'mse', 'hybrid', 'hybrid_sqrt', 'log']#
stage1_epochs: int | None#
two_stage: bool#
class twiga.models.nn.mlpfcrc_model.MLPFCRCModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for MLPF + CRC.

Example

>>> config = MLPFCRCConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPFCRCModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

MLPGAM — point model#

class twiga.models.nn.mlpgam_model.MLPGAMConfig(**data)#

Bases: BaseMLPConfig

Configuration for the MLPGAM (Generalized Additive Model-inspired MLP) forecasting model.

Extends the shared MLP base configuration with a lightweight L1 (Lasso) penalty applied specifically to the final projection / readout weights. This encourages sparsity and improves interpretability of feature importance in the output layer.
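
The penalty itself is the familiar L1 term; a minimal sketch, with the readout layer as a hypothetical stand-in:

>>> import torch
>>> readout = torch.nn.Linear(128, 24)       # stand-in for the final projection layer
>>> lambda_lasso = 1e-4
>>> penalty = lambda_lasso * readout.weight.abs().sum()
>>> # total training loss = forecast loss + penalty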

Variables:
  • name – Fixed model identifier (“mlpgam”).

  • lambda_lasso – L1 (Lasso) regularization strength applied to the final projection weights. Higher values promote sparsity in the output mapping.

  • search_space – Hyperparameter ranges for Optuna-based tuning. Extends the base MLP search space with the lambda_lasso parameter.

distribution: Literal['normal', 'laplace', 'lognormal', 'gamma', 'beta', 'studentt', 'qr', 'fpqr', 'crc'] | None#
lambda_lasso: float#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgam']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgam_model.MLPGAMModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for MLPGAMNetwork.

Instantiates MLPGAM from an MLPGAMConfig, manages training via PyTorch Lightning, and supports Optuna hyperparameter optimisation.

Example

>>> config = MLPGAMConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAMModel(model_config=config)
__init__(model_config)#

Initialise with a validated MLPGAMConfig.

Parameters:

model_config (MLPGAMConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions via the search space defined in MLPGAMConfig.search_space.

Return type:

None

MLPGAM — parametric variants#

class twiga.models.nn.mlpgamnormal_model.MLPGAMNormalConfig(**data)#

Bases: MLPGAMConfig

Configuration for Normal-distribution probabilistic MLPGAM.

Best for: symmetric, unbounded targets - energy demand, temperature.

Variables:
  • name – Fixed model identifier.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgamnormal']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgamnormal_model.MLPGAMNormalModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Normal-distribution probabilistic MLPGAM.

Example

>>> config = MLPGAMNormalConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAMNormalModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgamlaplace_model.MLPGAMLaplaceConfig(**data)#

Bases: MLPGAMConfig

Configuration for Laplace-distribution probabilistic MLPGAM.

Best for: heavy-tailed residuals robust to outliers - electricity prices, wind speed residuals.

Variables:
  • name – Fixed model identifier.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgamlaplace']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgamlaplace_model.MLPGAMLaplaceModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Laplace-distribution probabilistic MLPGAM.

Example

>>> config = MLPGAMLaplaceConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAMLaplaceModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgamlognormal_model.MLPGAMLogNormalConfig(**data)#

Bases: MLPGAMConfig

Configuration for LogNormal-distribution probabilistic MLPGAM.

Best for: strictly positive, right-skewed targets - gas prices, generation ramp-up events.

Variables:
  • name – Fixed model identifier.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgamlognormal']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgamlognormal_model.MLPGAMLogNormalModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for LogNormal-distribution probabilistic MLPGAM.

Example

>>> config = MLPGAMLogNormalConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAMLogNormalModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgamgamma_model.MLPGAMGammaConfig(**data)#

Bases: MLPGAMConfig

Configuration for Gamma-distribution probabilistic MLPGAM.

Best for: strictly positive targets with flexible skew - solar irradiance, wind power, aggregate load.

Note

Targets must be strictly positive. GammaDistribution constrains both distribution parameters internally via softplus; out_activation_function is fixed to Identity and excluded from tuning.

Variables:
  • name – Fixed model identifier.

  • out_activation_function – Fixed to Identity (not used by Gamma head).

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgamgamma']#
out_activation_function: Literal['Identity']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgamgamma_model.MLPGAMGammaModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Gamma-distribution probabilistic MLPGAM.

Example

>>> config = MLPGAMGammaConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAMGammaModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgambeta_model.MLPGAMBetaConfig(**data)#

Bases: MLPGAMConfig

Configuration for Beta-distribution probabilistic MLPGAM.

Best for: strictly bounded [0, 1] targets - capacity factors, state of charge, normalised demand ratios.

Note

Targets must lie strictly in (0, 1). Apply target = target.clamp(1e-6, 1 - 1e-6) in the data pipeline if boundary values are possible.
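
Applied to a batch tensor, the recommended clamp looks like this (shapes are illustrative):

>>> import torch
>>> target = torch.rand(32, 24)
>>> target = target.clamp(1e-6, 1 - 1e-6)    # keep values strictly inside (0, 1)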

Variables:
  • name – Fixed model identifier.

  • out_activation_function – Fixed to Identity (not used by Beta head).

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgambeta']#
out_activation_function: Literal['Identity']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgambeta_model.MLPGAMBetaModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Beta-distribution probabilistic MLPGAM.

Example

>>> config = MLPGAMBetaConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAMBetaModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgamstudentt_model.MLPGAMStudentTConfig(**data)#

Bases: MLPGAMConfig

Configuration for Student-T distribution probabilistic MLPGAM.

Best for: very heavy-tailed targets - spot electricity prices, financial returns.

Variables:
  • name – Fixed model identifier.

  • min_df – Minimum degrees of freedom for the Student-T distribution. Values close to 2 yield very heavy tails; higher values approach Normal.

  • search_space – Optuna hyperparameter search space.

min_df: float#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgamstudentt']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgamstudentt_model.MLPGAMStudentTModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Student-T distribution probabilistic MLPGAM.

Example

>>> config = MLPGAMStudentTConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAMStudentTModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

MLPGAM — quantile and CRC variants#

class twiga.models.nn.mlpgamqr_model.MLPGAMQRConfig(**data)#

Bases: MLPGAMConfig

Configuration for the MLPGAM fixed-grid quantile forecasting model.

Extends MLPGAMConfig with quantile regression parameters. The GAM encoder architecture (latent_size, hidden_dim, lambda_lasso, etc.) is fully inherited and tunable.

Variables:
  • name – Fixed model identifier.

  • quantiles – Explicit quantile levels to forecast. If None, defaults inside QRDistribution apply.

  • conf_level – Confidence level for symmetric interval construction (a worked example follows this list).

  • loss_fn – Pinball or Huber-pinball quantile loss.

  • kappa – Huber transition parameter (only relevant for ‘huber-pinball’).

  • eps – Numerical stability epsilon.

  • search_space – Optuna hyperparameter search space.
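
Assuming the usual symmetric construction, conf_level c maps to the quantile pair ((1 - c) / 2, 1 - (1 - c) / 2):

>>> conf_level = 0.90
>>> alpha = 1 - conf_level
>>> (round(alpha / 2, 2), round(1 - alpha / 2, 2))
(0.05, 0.95)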

conf_level: float#
crossing_penalty: float#
eps: float#
gradient_clip_val: float | None#
kappa: float#
loss_fn: Literal['pinball', 'huber-pinball']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgamqr']#
quantiles: list[float] | None#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgamqr_model.MLPGAMQRModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for fixed-grid quantile regression (MLPGAMQR).

Instantiates MLPGAMQR from an MLPGAMQRConfig, manages training via PyTorch Lightning, and supports Optuna hyperparameter optimisation.

Example

>>> config = MLPGAMQRConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
...     quantiles=[0.1, 0.5, 0.9],
... )
>>> model = MLPGAMQRModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgamfpqr_model.MLPGAMFPQRConfig(**data)#

Bases: MLPGAMConfig

Configuration for the MLPGAM flexible quantile proposal forecasting model.

Extends MLPGAMConfig with FPQR-specific parameters. Unlike fixed-grid quantile regression, this model adaptively proposes its own quantile grid during training via FPQRDistribution.

Variables:
  • name – Fixed model identifier.

  • n_quantiles – Number of quantile levels for the proposal network.

  • conf_level – Confidence level for symmetric interval construction.

  • loss_fn – Pinball or Huber-pinball quantile loss.

  • kappa – Huber transition parameter (only relevant for ‘huber-pinball’).

  • num_cosines – Cosine basis functions for the quantile embedding layer.

  • search_space – Optuna hyperparameter search space.

conf_level: float#
gradient_clip_val: float | None#
kappa: float#
loss_fn: Literal['pinball', 'huber-pinball']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

n_quantiles: int#
name: Literal['mlpgamfpqr']#
num_cosines: int#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgamfpqr_model.MLPGAMFPQRModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for flexible quantile proposal regression (MLPGAMFPQR).

Instantiates MLPGAMFPQR from an MLPGAMFPQRConfig, manages training via PyTorch Lightning, and supports Optuna hyperparameter optimisation.

Example

>>> config = MLPGAMFPQRConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
...     n_quantiles=9,
... )
>>> model = MLPGAMFPQRModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgamcrc_model.MLPGAMCRCConfig(**data)#

Bases: MLPGAMConfig

Configuration for MLPGAM + Conditional Residual Calibration (additive-preserving).

Pairs the MLPGAM additive backbone with a CRC uncertainty head. The backbone’s additive mean is used directly (no projection) to preserve the GAM decomposition; the scale MLP is conditioned on the detached mean.
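
A minimal sketch of the detached-mean conditioning (module names here are hypothetical stand-ins, not the library's internals):

>>> import torch
>>> backbone = torch.nn.Linear(96, 24)       # stand-in for the additive backbone
>>> scale_mlp = torch.nn.Sequential(torch.nn.Linear(24, 24), torch.nn.Softplus())
>>> x = torch.randn(32, 96)
>>> mu = backbone(x)                          # additive mean, used directly
>>> sigma = scale_mlp(mu.detach())            # sigma path sends no gradient to the backbone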

Variables:
  • name – Fixed model identifier.

  • two_stage – When True (default), uses two separate optimizers — one for the backbone (mu path) and one for the sigma MLP — implementing the paper’s frozen-backbone Stage 2. Set False for joint single-optimizer training.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgamcrc']#
sigma_loss_fn: Literal['gaussian', 'laplace', 'mse', 'hybrid', 'hybrid_sqrt', 'log']#
stage1_epochs: int | None#
two_stage: bool#
class twiga.models.nn.mlpgamcrc_model.MLPGAMCRCModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for MLPGAM + CRC.

Example

>>> config = MLPGAMCRCConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAMCRCModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

MLPGAF — point model#

class twiga.models.nn.mlpgaf_model.MLPGAFConfig(**data)#

Bases: BaseMLPConfig

Configuration for the MLPGAF (Gated Attention Fusion) forecasting model.

Extends the shared MLP base configuration with lightweight channel-wise gating mechanisms and associated regularization terms. The gating allows the model to softly select or suppress feature channels in a learnable way.
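
A self-contained sketch of what channel-wise sigmoid gating of this kind looks like (the shapes, the temperature placement, and the penalty form are assumptions made for illustration):

>>> import torch
>>> x = torch.randn(32, 96, 5)                        # (batch, time, channels)
>>> gate_logits = torch.nn.Parameter(torch.zeros(5))  # one learnable logit per channel
>>> g = torch.sigmoid(2.0 * gate_logits)              # gate_scale = 2.0 scales the raw logits
>>> gated = x * g                                     # softly selects or suppresses channels
>>> penalty = 1e-6 * g.abs().sum()                    # lambda_gate sparsity penalty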

Default values and search space ranges reflect ablation study findings (a configuration sketch applying them follows the list):

  • value_embed_type=’ConvEmb’ consistently performed best.

  • Disabling RevIN (use_revin=False) provided the largest single improvement.

  • Near-zero or no gate penalty (lambda_gate ≈ 1e-6 or disabled) outperformed stronger regularization.

  • Short/no warmup (warmup_epochs ≤ 5) was clearly superior to longer schedules.

  • sigmoid gating performed marginally better than alternatives.
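
A construction sketch that spells out these findings explicitly (the values below follow the reported ablation results, so passing them is illustrative; value_embed_type is assumed to be inherited from the shared MLP base configuration):

>>> config = MLPGAFConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
...     value_embed_type="ConvEmb",
...     use_revin=False,
...     lambda_gate=1e-6,
...     warmup_epochs=0,
...     gate_type="sigmoid",
... )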

Variables:
  • name – Fixed model identifier (“mlpgaf”).

  • use_revin – Whether to apply Reversible Instance Normalization to input series.

  • lambda_weight – L1 regularization coefficient applied to encoder weights.

  • lambda_gate – Sparsity-inducing L1 penalty on the gating mechanism.

  • delta – Small constant added for numerical stability in the gate penalty term.

  • gate_scale – Scaling factor (temperature-like) applied to raw gate logits.

  • gate_type – Nonlinearity used for the gating function.

  • warmup_epochs – Number of epochs during which gate penalty is linearly ramped up.

  • search_space – Hyperparameter ranges used for Optuna-based tuning. Extends the base MLP search space with gating-related parameters.

delta: float#
distribution: Literal['normal', 'laplace', 'lognormal', 'gamma', 'beta', 'studentt', 'qr', 'fpqr', 'crc'] | None#
gate_scale: float#
gate_type: Literal['sigmoid']#
lambda_gate: float#
lambda_weight: float#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgaf']#
search_space: BaseSearchSpace#
use_revin: bool#
warmup_epochs: int#
class twiga.models.nn.mlpgaf_model.MLPGAFModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for MLPGAFNetwork.

Instantiates MLPGAF from an MLPGAFConfig, manages training via PyTorch Lightning, and supports Optuna hyperparameter optimisation.

Example

>>> config = MLPGAFConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAFModel(model_config=config)
__init__(model_config)#

Initialise with a validated MLPGAFConfig.

Parameters:

model_config (MLPGAFConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions via the search space defined in MLPGAFConfig.search_space.

Return type:

None

MLPGAF — parametric variants#

class twiga.models.nn.mlpgafnormal_model.MLPGAFNormalConfig(**data)#

Bases: MLPGAFConfig

Configuration for Normal-distribution probabilistic MLPGAF.

Best for: symmetric, unbounded targets - energy demand, temperature.

Variables:
  • name – Fixed model identifier.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgafnormal']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgafnormal_model.MLPGAFNormalModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Normal-distribution probabilistic MLPGAF.

Example

>>> config = MLPGAFNormalConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAFNormalModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgaflaplace_model.MLPGAFLaplaceConfig(**data)#

Bases: MLPGAFConfig

Configuration for Laplace-distribution probabilistic MLPGAF.

Best for: heavy-tailed residuals robust to outliers - electricity prices, wind speed residuals.

Variables:
  • name – Fixed model identifier.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgaflaplace']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgaflaplace_model.MLPGAFLaplaceModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Laplace-distribution probabilistic MLPGAF.

Example

>>> config = MLPGAFLaplaceConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAFLaplaceModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgaflognormal_model.MLPGAFLogNormalConfig(**data)#

Bases: MLPGAFConfig

Configuration for LogNormal-distribution probabilistic MLPGAF.

Best for: strictly positive, right-skewed targets - gas prices, generation ramp-up events.

Variables:
  • name – Fixed model identifier.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgaflognormal']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgaflognormal_model.MLPGAFLogNormalModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for LogNormal-distribution probabilistic MLPGAF.

Example

>>> config = MLPGAFLogNormalConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAFLogNormalModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgafgamma_model.MLPGAFGammaConfig(**data)#

Bases: MLPGAFConfig

Configuration for Gamma-distribution probabilistic MLPGAF.

Best for: strictly positive targets with flexible skew - solar irradiance, wind power, aggregate load.

Note

Targets must be strictly positive. GammaDistribution constrains both distribution parameters internally via softplus; out_activation_function is fixed to Identity and excluded from tuning.

Variables:
  • name – Fixed model identifier.

  • out_activation_function – Fixed to Identity (not used by Gamma head).

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgafgamma']#
out_activation_function: Literal['Identity']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgafgamma_model.MLPGAFGammaModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Gamma-distribution probabilistic MLPGAF.

Example

>>> config = MLPGAFGammaConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAFGammaModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgafbeta_model.MLPGAFBetaConfig(**data)#

Bases: MLPGAFConfig

Configuration for Beta-distribution probabilistic MLPGAF.

Best for: strictly bounded [0, 1] targets - capacity factors, state of charge, normalised demand ratios.

Note

Targets must lie strictly in (0, 1). Apply target = target.clamp(1e-6, 1 - 1e-6) in the data pipeline if boundary values are possible.

Variables:
  • name – Fixed model identifier.

  • out_activation_function – Fixed to Identity (not used by Beta head).

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgafbeta']#
out_activation_function: Literal['Identity']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgafbeta_model.MLPGAFBetaModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Beta-distribution probabilistic MLPGAF.

Example

>>> config = MLPGAFBetaConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAFBetaModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgafstudentt_model.MLPGAFStudentTConfig(**data)#

Bases: MLPGAFConfig

Configuration for Student-T distribution probabilistic MLPGAF.

Best for: very heavy-tailed targets - spot electricity prices, financial returns.

Variables:
  • name – Fixed model identifier.

  • min_df – Minimum degrees of freedom for the Student-T distribution. Values close to 2 yield very heavy tails; higher values approach Normal.

  • search_space – Optuna hyperparameter search space.

min_df: float#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgafstudentt']#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgafstudentt_model.MLPGAFStudentTModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Student-T distribution probabilistic MLPGAF.

Example

>>> config = MLPGAFStudentTConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAFStudentTModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

MLPGAF — quantile and CRC variants#

class twiga.models.nn.mlpgafqr_model.MLPGAFQRConfig(**data)#

Bases: MLPGAFConfig

Configuration for the MLPGAF fixed-grid quantile forecasting model.

Extends MLPGAFConfig with quantile regression parameters. The GAF encoder architecture (including gating params) is fully inherited and tunable.

Variables:
  • name – Fixed model identifier.

  • quantiles – Explicit quantile levels to forecast. If None, defaults inside QRDistribution apply.

  • conf_level – Confidence level for symmetric interval construction.

  • loss_fn – Pinball or Huber-pinball quantile loss.

  • kappa – Huber transition parameter (only relevant for ‘huber-pinball’).

  • eps – Numerical stability epsilon.

  • search_space – Optuna hyperparameter search space.

conf_level: float#
crossing_penalty: float#
eps: float#
gradient_clip_val: float | None#
kappa: float#
loss_fn: Literal['pinball', 'huber-pinball']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgafqr']#
quantiles: list[float] | None#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgafqr_model.MLPGAFQRModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for fixed-grid quantile regression (MLPGAFQR).

Instantiates MLPGAFQR from an MLPGAFQRConfig, manages training via PyTorch Lightning, and supports Optuna hyperparameter optimisation.

Example

>>> config = MLPGAFQRConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
...     quantiles=[0.1, 0.5, 0.9],
... )
>>> model = MLPGAFQRModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgaffpqr_model.MLPGAFFPQRConfig(**data)#

Bases: MLPGAFConfig

Configuration for the MLPGAF flexible quantile proposal forecasting model.

Extends MLPGAFConfig with FPQR-specific parameters. Unlike fixed-grid quantile regression, this model adaptively proposes its own quantile grid during training via FPQRDistribution. The GAF backbone’s learned feature gating is fully inherited and tunable.

Variables:
  • name – Fixed model identifier.

  • n_quantiles – Number of quantile levels for the proposal network.

  • conf_level – Confidence level for symmetric interval construction.

  • loss_fn – Pinball or Huber-pinball quantile loss.

  • kappa – Huber transition parameter (only relevant for ‘huber-pinball’).

  • num_cosines – Cosine basis functions for the quantile embedding layer.

  • search_space – Optuna hyperparameter search space.

conf_level: float#
gradient_clip_val: float | None#
kappa: float#
loss_fn: Literal['pinball', 'huber-pinball']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

n_quantiles: int#
name: Literal['mlpgaffpqr']#
num_cosines: int#
search_space: BaseSearchSpace#
class twiga.models.nn.mlpgaffpqr_model.MLPGAFFPQRModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for flexible quantile proposal regression (MLPGAFFPQR).

Instantiates MLPGAFFPQR from an MLPGAFFPQRConfig, manages training via PyTorch Lightning, and supports Optuna hyperparameter optimisation.

Example

>>> config = MLPGAFFPQRConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
...     n_quantiles=9,
... )
>>> model = MLPGAFFPQRModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.mlpgafcrc_model.MLPGAFCRCConfig(**data)#

Bases: MLPGAFConfig

Configuration for MLPGAF + Conditional Residual Calibration (additive-preserving).

Pairs the MLPGAF gated-attention-fusion backbone with a CRC uncertainty head. The backbone’s additive mean is used directly (no projection) to preserve the GAF decomposition; the scale MLP is conditioned on the detached mean.

Variables:
  • name – Fixed model identifier.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['mlpgafcrc']#
search_space: BaseSearchSpace#
sigma_loss_fn: Literal['gaussian', 'laplace', 'mse', 'hybrid', 'hybrid_sqrt', 'log']#
stage1_epochs: int | None#
two_stage: bool#
class twiga.models.nn.mlpgafcrc_model.MLPGAFCRCModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for MLPGAF + CRC.

Example

>>> config = MLPGAFCRCConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = MLPGAFCRCModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

N-HiTS — point model#

class twiga.models.nn.nhits_model.NHITSConfig(**data)#

Bases: BaseMLPConfig

Configuration for NHITS forecasting models.

Extends BaseMLPConfig with parameters specific to the NHITS (Neural Hierarchical Interpolation for Time Series) model, which uses stacked blocks for hierarchical forecasting. Configures model architecture, training, and hyperparameter tuning.

The three dimension fields (num_target_feature, forecast_horizon, lookback_window_size) default to 0 and are auto-populated from DataPipelineConfig when the config is passed to TwigaForecaster.

Parameters:
  • name (Literal["nhits"], optional) – Model type identifier. Fixed to “nhits”. Defaults to “nhits”.

  • num_target_feature (int, optional) – Number of target features to predict. Defaults to 0 (auto-populated by TwigaForecaster).

  • num_historical_features (int, optional) – Number of historical features. Defaults to 0.

  • num_calendar_features (int, optional) – Number of calendar features. Defaults to 0.

  • num_exogenous_features (int, optional) – Number of exogenous features. Defaults to 0.

  • num_future_covariates (int, optional) – Number of future covariates. Defaults to 0.

  • forecast_horizon (int, optional) – Number of time steps to forecast. Defaults to 0 (auto-populated by TwigaForecaster).

  • lookback_window_size (int, optional) – Size of the input lookback window. Defaults to 0 (auto-populated by TwigaForecaster).

  • stack_types (list[Literal["identity"]], optional) – Types of blocks in each stack (e.g., “identity” for standard forecasting). Defaults to [“identity”, “identity”, “identity”].

  • n_blocks (list[int], optional) – Number of blocks per stack. Defaults to [1, 1, 1].

  • mlp_units (list[list[int]], optional) – Number of units in MLP layers per block. Defaults to [[512, 512], [512, 512], [512, 512]].

  • n_pool_kernel_size (list[int], optional) – Pooling kernel sizes per stack. Defaults to [2, 2, 1].

  • n_freq_downsample (list[int], optional) – Frequency downsampling factors per stack. Defaults to [4, 2, 1].

  • pooling_mode (Literal["MaxPool1d", "AvgPool1d"], optional) – Pooling mode for stacks. Defaults to “MaxPool1d”.

  • interpolation_mode (Literal["linear", "nearest", "cubic"], optional) – Interpolation mode for forecasting. Defaults to “linear”.

  • decompose_forecast (bool, optional) – Decompose forecast into basis components. Defaults to False.

  • activation_function (Literal[...], optional) – Activation function for hidden layers. Defaults to “ReLU”.

  • out_activation_function (str, optional) – Activation function for output layer. Defaults to “Identity”.

  • dropout (float, optional) – Dropout rate for regularization (0 to 1). Defaults to 0.25.

  • alpha (float, optional) – Alpha parameter for the loss function (0 to 1). Defaults to 0.1.

  • search_space (BaseSearchSpace, optional) – Hyperparameter search space for tuning. Defaults to a predefined space.

Notes

Inherits fields from NeuralModelConfig (e.g., optimizer_params, max_epochs). The name field is fixed and excluded from tuning. NHITS-specific parameters (e.g., n_blocks, mlp_units) configure the stacked block architecture.
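
A construction sketch spelling out the documented stack defaults (passing them explicitly is illustrative, since they match the listed default values):

>>> config = NHITSConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
...     stack_types=["identity", "identity", "identity"],
...     n_blocks=[1, 1, 1],
...     n_pool_kernel_size=[2, 2, 1],
...     n_freq_downsample=[4, 2, 1],
... )
>>> model = NHITSModel(model_config=config)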

activation_function: Literal['ReLU', 'Softplus', 'Tanh', 'Sigmoid', 'SiLU', 'GELU', 'ELU', 'SELU', 'LeakyReLU', 'PReLU']#
alpha: float#
decompose_forecast: bool#
distribution: str | None#
dropout: float#
encode_size: int#
interpolation_mode: Literal['linear', 'nearest', 'cubic']#
mlp_units: list[list[int]]#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

n_blocks: list[int] | None#
n_freq_downsample: list[int] | None#
n_pool_kernel_size: list[int] | None#
name: Literal['nhits']#
out_activation_function: str#
pooling_mode: Literal['MaxPool1d', 'AvgPool1d']#
search_space: BaseSearchSpace#
stack_types: list[Literal['identity']]#
class twiga.models.nn.nhits_model.NHITSModel(model_config=None)#

Bases: BaseNeuralForecast

N-HiTS forecast model for time series point forecasting using a stacked-block architecture.

This model implements a hierarchical-interpolation forecaster built from stacked identity blocks with configurable pooling and feature processing. Model architecture and training parameters are controlled through the NHITSConfig configuration class.

Variables:
  • model_config (NHITSConfig) – Complete configuration object containing model architecture and training parameters. See NHITSConfig documentation for full details.

  • model – Instantiated neural network implementing the N-HiTS forecast logic.

  • metric (str) – Primary evaluation metric used for model selection (inherited from BaseNeuralModel).

Example

>>> from twiga.models.nn.nhits_model import NHITSConfig, NHITSModel
>>> data_config = DataPipelineConfig(
...     target_feature=["energy_demand"],
...     historical_features=["temp", "humidity"],
...     forecast_horizon=24,
...     lookback_window_size=168,
... )
>>> model_config = NHITSConfig.from_data_config(data_config)
>>> model = NHITSModel(model_config=model_config)
>>> model.update(trial=optuna_trial)  # Hyperparameter update
__init__(model_config=None)#

Initialize the N-HiTS forecast model with the provided configuration.

Parameters:

model_config (NHITSConfig | None) – Configuration object containing: num_target_feature (number of target series to forecast), num_historical_features / num_calendar_features / num_exogenous_features / num_future_covariates (feature dimensions), forecast_horizon (prediction steps), lookback_window_size (historical window size), stack_types / n_blocks / mlp_units (stacked block layout), activation_function (hidden layer activation), dropout (dropout probability), and max_epochs (training iterations). See NHITSConfig for the full parameter list.

load_checkpoint()#

Load the latest checkpoint for the model.

This method retrieves the path of the latest checkpoint and loads the model state into the current instance.

update(trial)#

Update model hyperparameters using Optuna trial suggestions.

Parameters:

trial (Trial) – Optuna optimization trial providing hyperparameter sampling and pruning capabilities. Interacts with the search space defined in NHITSConfig.search_space.

Return type:

None

Updates:

Reinitializes the neural network with trial-suggested parameters while preserving configuration structure and validation rules.

N-HiTS — parametric variants#

class twiga.models.nn.nhitsnormal_model.NHITSNormalConfig(**data)#

Bases: NHITSConfig

Configuration for Normal-distribution probabilistic NHiTS.

Best for: symmetric, unbounded targets - energy demand, temperature.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['nhitsnormal']#
search_space: BaseSearchSpace#
class twiga.models.nn.nhitsnormal_model.NHITSNormalModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Normal-distribution probabilistic NHiTS.
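
Example (construction mirrors the other wrappers on this page; the remaining NHiTS parametric variants follow the same pattern)

>>> config = NHITSNormalConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = NHITSNormalModel(model_config=config)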

load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.nhitslaplace_model.NHITSLaplaceConfig(**data)#

Bases: NHITSConfig

Configuration for Laplace-distribution probabilistic NHiTS.

Best for: heavy-tailed, outlier-robust targets - electricity prices.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['nhitslaplace']#
search_space: BaseSearchSpace#
class twiga.models.nn.nhitslaplace_model.NHITSLaplaceModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Laplace-distribution probabilistic NHiTS.

load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.nhitslognormal_model.NHITSLogNormalConfig(**data)#

Bases: NHITSConfig

Configuration for LogNormal-distribution probabilistic NHiTS.

Best for: strictly positive, right-skewed targets - gas prices.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['nhitslognormal']#
search_space: BaseSearchSpace#
class twiga.models.nn.nhitslognormal_model.NHITSLogNormalModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for LogNormal-distribution probabilistic NHiTS.

load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.nhitsgamma_model.NHITSGammaConfig(**data)#

Bases: NHITSConfig

Configuration for Gamma-distribution probabilistic NHiTS.

Best for: strictly positive, flexible-skew targets - solar, wind generation.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['nhitsgamma']#
search_space: BaseSearchSpace#
class twiga.models.nn.nhitsgamma_model.NHITSGammaModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Gamma-distribution probabilistic NHiTS.

load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.nhitsbeta_model.NHITSBetaConfig(**data)#

Bases: NHITSConfig

Configuration for Beta-distribution probabilistic NHiTS.

Best for: bounded [0, 1] targets - capacity factors, state of charge.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['nhitsbeta']#
search_space: BaseSearchSpace#
class twiga.models.nn.nhitsbeta_model.NHITSBetaModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Beta-distribution probabilistic NHiTS.

load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.nhitsstudentt_model.NHITSStudentTConfig(**data)#

Bases: NHITSConfig

Configuration for Student-T-distribution probabilistic NHiTS.

Best for: very heavy tails - spot prices, financial returns.

min_df: float#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['nhitsstudentt']#
search_space: BaseSearchSpace#
class twiga.models.nn.nhitsstudentt_model.NHITSStudentTModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Student-T-distribution probabilistic NHiTS.

load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

N-HiTS — quantile and CRC variants#

class twiga.models.nn.nhitsqr_model.NHITSQRConfig(**data)#

Bases: NHITSConfig

Configuration for NHiTS fixed-grid quantile regression.

Extends NHITSConfig with quantile regression parameters.

Variables:
  • name – Fixed model identifier.

  • quantiles – Explicit quantile levels. If None, defaults inside QRDistribution apply.

  • conf_level – Confidence level for interval construction.

  • loss_fn – Pinball or Huber-pinball quantile loss.

  • kappa – Huber transition parameter.

  • eps – Numerical stability epsilon.

conf_level: float#
crossing_penalty: float#
eps: float#
gradient_clip_val: float | None#
kappa: float#
loss_fn: Literal['pinball', 'huber-pinball']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['nhitsqr']#
quantiles: list[float] | None#
search_space: BaseSearchSpace#
class twiga.models.nn.nhitsqr_model.NHITSQRModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for NHiTS fixed-grid quantile regression.

load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.nhitsqr_model.NHITSFPQRConfig(**data)#

Bases: NHITSConfig

Configuration for NHiTS flexible-proposal quantile regression.

Extends NHITSConfig with FPQR-specific parameters.

Variables:
  • name – Fixed model identifier.

  • n_quantiles – Number of quantile proposals.

  • conf_level – Confidence level for interval construction.

  • loss_fn – Pinball or Huber-pinball quantile loss.

  • kappa – Huber transition parameter.

  • num_cosines – Cosine features for the quantile embedding layer.

conf_level: float#
gradient_clip_val: float | None#
kappa: float#
loss_fn: Literal['pinball', 'huber-pinball']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

n_quantiles: int#
name: Literal['nhitsfpqr']#
num_cosines: int#
search_space: BaseSearchSpace#
class twiga.models.nn.nhitsqr_model.NHITSFPQRModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for NHiTS flexible-proposal quantile regression.

load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

class twiga.models.nn.nhitscrc_model.NHITSCRCConfig(**data)#

Bases: NHITSConfig

Configuration for NHiTS + Conditional Residual Calibration.

Pairs the NHiTS backbone with a CRC uncertainty head that approximates absolute residuals. Inherits all NHiTS architecture parameters.

Variables:
  • name – Fixed model identifier.

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['nhitscrc']#
search_space: BaseSearchSpace#
sigma_loss_fn: Literal['gaussian', 'laplace', 'mse', 'hybrid', 'hybrid_sqrt', 'log']#
stage1_epochs: int | None#
two_stage: bool#
class twiga.models.nn.nhitscrc_model.NHITSCRCModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for NHiTS + CRC.

Example

>>> config = NHITSCRCConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = NHITSCRCModel(model_config=config)
load_checkpoint()#

Load model from a checkpoint.

Parameters:

checkpoints_path – Path to the checkpoint file.

Return type:

None

update(trial)#

Update the model with new hyperparameters.

Parameters:

trial (Trial) – Optuna trial object containing hyperparameters.

Return type:

None

RNN — point model#

class twiga.models.nn.rnn_model.RNNConfig(**data)#

Bases: BaseMLPConfig

Configuration for the RNN forecasting model.

Extends BaseMLPConfig with RNN-specific parameters: hidden state width, number of stacked layers, cell type (GRU/LSTM), and bidirectionality.

The three sequence-dimension fields (num_target_feature, forecast_horizon, lookback_window_size) default to 0 and are auto-populated from DataPipelineConfig when passed to TwigaForecaster.

Variables:
  • name – Fixed model identifier "rnn".

  • hidden_size – Width of the RNN hidden state.

  • n_layers – Number of stacked RNN layers.

  • cell_type – RNN cell type — "gru" or "lstm".

  • bidirectional – Whether to use a bidirectional RNN.

  • search_space – Optuna hyperparameter search space.

Examples

>>> config = RNNConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = RNNModel(model_config=config)
bidirectional: bool#
cell_type: Literal['gru', 'lstm']#
hidden_size: int#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

n_layers: int#
name: Literal['rnn']#
search_space: BaseSearchSpace#
class twiga.models.nn.rnn_model.RNNModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for RNNForecastNetwork.

Instantiates RNN from an RNNConfig, manages training via PyTorch Lightning, and supports Optuna hyperparameter optimisation.

Example

>>> config = RNNConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = RNNModel(model_config=config)
__init__(model_config)#

Initialise with a validated RNNConfig.

Parameters:

model_config (RNNConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions via the search space defined in RNNConfig.search_space.

Return type:

None

RNN — parametric variants#

class twiga.models.nn.rnnnormal_model.RNNNormalConfig(**data)#

Bases: RNNConfig

Configuration for Normal-distribution probabilistic RNN.

Best for: symmetric, unbounded targets — energy demand, temperature.

Variables:
  • name – Fixed model identifier "rnnnormal".

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: Literal['rnnnormal']#
search_space: BaseSearchSpace#
class twiga.models.nn.rnnnormal_model.RNNNormalModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Normal-distribution probabilistic RNN.
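
Example (construction mirrors the other wrappers on this page; the remaining RNN parametric variants follow the same pattern)

>>> config = RNNNormalConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
... )
>>> model = RNNNormalModel(model_config=config)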

__init__(model_config)#

Initialise with a validated RNNNormalConfig.

Parameters:

model_config (RNNNormalConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions.

Return type:

None

class twiga.models.nn.rnnlaplace_model.RNNLaplaceConfig(**data)#

Bases: RNNConfig

Configuration for Laplace-distribution probabilistic RNN.

Best for: heavy-tailed, outlier-robust forecasting — electricity prices.

Variables:
  • name – Fixed model identifier "rnnlaplace".

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model; should be a dictionary conforming to Pydantic's ConfigDict.

name: Literal['rnnlaplace']#
search_space: BaseSearchSpace#
class twiga.models.nn.rnnlaplace_model.RNNLaplaceModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Laplace-distribution probabilistic RNN.

__init__(model_config)#

Initialise with a validated RNNLaplaceConfig.

Parameters:

model_config (RNNLaplaceConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions.

Return type:

None

class twiga.models.nn.rnnlognormal_model.RNNLogNormalConfig(**data)#

Bases: RNNConfig

Configuration for LogNormal-distribution probabilistic RNN.

Best for: strictly positive, right-skewed targets — gas prices.

Variables:
  • name – Fixed model identifier "rnnlognormal".

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model; should be a dictionary conforming to Pydantic's ConfigDict.

name: Literal['rnnlognormal']#
search_space: BaseSearchSpace#
class twiga.models.nn.rnnlognormal_model.RNNLogNormalModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for LogNormal-distribution probabilistic RNN.

__init__(model_config)#

Initialise with a validated RNNLogNormalConfig.

Parameters:

model_config (RNNLogNormalConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions.

Return type:

None

class twiga.models.nn.rnngamma_model.RNNGammaConfig(**data)#

Bases: RNNConfig

Configuration for Gamma-distribution probabilistic RNN.

Best for: strictly positive, flexible-skew targets — solar irradiance, wind.

Variables:
  • name – Fixed model identifier "rnngamma".

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model; should be a dictionary conforming to Pydantic's ConfigDict.

name: Literal['rnngamma']#
search_space: BaseSearchSpace#
class twiga.models.nn.rnngamma_model.RNNGammaModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Gamma-distribution probabilistic RNN.

__init__(model_config)#

Initialise with a validated RNNGammaConfig.

Parameters:

model_config (RNNGammaConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions.

Return type:

None

class twiga.models.nn.rnnbeta_model.RNNBetaConfig(**data)#

Bases: RNNConfig

Configuration for Beta-distribution probabilistic RNN.

Best for: bounded [0, 1] targets — capacity factors, state of charge.

Variables:
  • name – Fixed model identifier "rnnbeta".

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model; should be a dictionary conforming to Pydantic's ConfigDict.

name: Literal['rnnbeta']#
search_space: BaseSearchSpace#
class twiga.models.nn.rnnbeta_model.RNNBetaModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Beta-distribution probabilistic RNN.

__init__(model_config)#

Initialise with a validated RNNBetaConfig.

Parameters:

model_config (RNNBetaConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions.

Return type:

None

class twiga.models.nn.rnnstudentt_model.RNNStudentTConfig(**data)#

Bases: RNNConfig

Configuration for Student-T-distribution probabilistic RNN.

Best for: very heavy tails — spot prices, financial returns.

Variables:
  • name – Fixed model identifier "rnnstudentt".

  • search_space – Optuna hyperparameter search space.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model; should be a dictionary conforming to Pydantic's ConfigDict.

name: Literal['rnnstudentt']#
search_space: BaseSearchSpace#
class twiga.models.nn.rnnstudentt_model.RNNStudentTModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for Student-T-distribution probabilistic RNN.

__init__(model_config)#

Initialise with a validated RNNStudentTConfig.

Parameters:

model_config (RNNStudentTConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions.

Return type:

None

RNN — quantile variants#

class twiga.models.nn.rnnqr_model.RNNQRConfig(**data)#

Bases: RNNConfig

Configuration for RNN fixed-grid quantile regression.

Extends RNNConfig with quantile regression parameters.

Variables:
  • name – Fixed model identifier "rnnqr".

  • quantiles – Explicit quantile levels. If None, defaults inside QRDistribution apply.

  • conf_level – Confidence level for interval construction.

  • loss_fn – Pinball or Huber-pinball quantile loss.

  • kappa – Huber transition parameter.

  • eps – Numerical stability epsilon.

  • crossing_penalty – Weight of the quantile-crossing penalty term.

  • gradient_clip_val – Gradient-clipping value; None disables clipping.

  • search_space – Optuna hyperparameter search space.

conf_level: float#
crossing_penalty: float#
eps: float#
gradient_clip_val: float | None#
kappa: float#
loss_fn: Literal['pinball', 'huber-pinball']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model; should be a dictionary conforming to Pydantic's ConfigDict.

name: Literal['rnnqr']#
quantiles: list[float] | None#
search_space: BaseSearchSpace#
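
Examples

A construction sketch using the quantile-regression fields documented above; the quantile grid and loss choice are illustrative:

>>> config = RNNQRConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
...     quantiles=[0.05, 0.25, 0.5, 0.75, 0.95],
...     loss_fn="pinball",
... )
>>> model = RNNQRModel(model_config=config)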
class twiga.models.nn.rnnqr_model.RNNQRModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for RNN fixed-grid quantile regression.

__init__(model_config)#

Initialise with a validated RNNQRConfig.

Parameters:

model_config (RNNQRConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions.

Return type:

None

class twiga.models.nn.rnnqr_model.RNNFPQRConfig(**data)#

Bases: RNNConfig

Configuration for RNN flexible-proposal quantile regression.

Extends RNNConfig with FPQR-specific parameters.

Variables:
  • name – Fixed model identifier "rnnfpqr".

  • n_quantiles – Number of quantile proposals.

  • conf_level – Confidence level for interval construction.

  • loss_fn – Pinball or Huber-pinball quantile loss.

  • kappa – Huber transition parameter.

  • num_cosines – Cosine features for the quantile embedding layer.

  • gradient_clip_val – Gradient-clipping value; None disables clipping.

  • search_space – Optuna hyperparameter search space.

conf_level: float#
gradient_clip_val: float | None#
kappa: float#
loss_fn: Literal['pinball', 'huber-pinball']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model; should be a dictionary conforming to Pydantic's ConfigDict.

n_quantiles: int#
name: Literal['rnnfpqr']#
num_cosines: int#
search_space: BaseSearchSpace#
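
Examples

A construction sketch; the proposal count and cosine-embedding size are illustrative values:

>>> config = RNNFPQRConfig(
...     num_target_feature=1,
...     num_historical_features=5,
...     forecast_horizon=24,
...     lookback_window_size=96,
...     n_quantiles=32,
...     num_cosines=64,
... )
>>> model = RNNFPQRModel(model_config=config)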
class twiga.models.nn.rnnqr_model.RNNFPQRModel(model_config)#

Bases: BaseNeuralForecast

Training lifecycle wrapper for RNN flexible-proposal quantile regression.

__init__(model_config)#

Initialise with a validated RNNFPQRConfig.

Parameters:

model_config (RNNFPQRConfig) – Full model configuration.

load_checkpoint()#

Load the latest checkpoint and set model to eval mode.

Return type:

None

update(trial)#

Re-initialise model with Optuna-suggested hyperparameters.

Parameters:

trial (Trial) – Optuna trial providing parameter suggestions.

Return type:

None

See also: Model Catalog | Configuration System | TwigaForecaster | Data Pipeline | Quantile Regression