Machine Learning Models#

Source Files
  • twiga/models/ml/core/base_regressor.py - BaseRegressor (sklearn-compatible base class)

  • twiga/models/ml/catboost_model.py - CATBOOSTModel / CATBOOSTConfig

  • twiga/models/ml/xgboost_model.py - XGBOOSTModel / XGBOOSTConfig

  • twiga/models/ml/lightgbm_model.py - LIGHTGBMModel / LIGHTGBMConfig

  • twiga/models/ml/lineareg_model.py - LINEAREGModel / LINEAREGConfig

  • twiga/models/ml/randomforest_model.py - RANDOMFORESTModel / RANDOMFORESTConfig

Twiga’s ML domain provides gradient-boosted tree models, a random forest, and a linear baseline, all sharing a common scikit-learn-compatible interface defined by BaseRegressor. Each model is configured through a Pydantic config class that inherits from BaseModelConfig and includes a default hyperparameter search space for Optuna-based tuning.

For the full model catalogue (including neural network models) see the Model Catalog Overview.

BaseRegressor Interface#

BaseRegressor extends scikit-learn’s BaseEstimator and RegressorMixin, giving every ML model in Twiga a uniform API that the TwigaForecaster relies on.

Constructor#

BaseRegressor(data_pipeline: Any | None = None)

| Attribute | Type | Description |
|---|---|---|
| model | Any \| None | The underlying regression estimator. Set by each subclass. |
| data_pipeline | Any \| None | Optional data preprocessing pipeline reference. |
| num_targets | int \| None | Number of target variables, determined during fit(). |

Key Methods#

| Method | Signature | Description |
|---|---|---|
| format_features | format_features(x: np.ndarray) -> np.ndarray | Flattens 3-D input (batch, seq_len, features) to 2-D (batch, seq_len * features). Passthrough for arrays that are already 2-D. |
| fit | fit(X: np.ndarray, y: np.ndarray, verbose: bool = False) -> BaseRegressor | Formats features and targets to 2-D, then delegates to the underlying model’s fit. Sets num_targets from the target shape. |
| predict | predict(x: np.ndarray) -> np.ndarray | Accepts 3-D input, returns predictions reshaped to (batch, seq_len, num_targets). |
| forecast | forecast(x: np.ndarray) -> dict | Calls predict and returns the result directly (point forecast models return the raw array; the forecaster wraps it with "loc" key handling). |

Why flatten to 2-D?

Scikit-learn estimators expect 2-D (samples, features) input. format_features reshapes the 3-D sliding-window arrays produced by the DataPipeline into the flat format that tree-based models require, while preserving batch semantics for reshaping predictions back to 3-D.
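The reshape round trip can be sketched with plain NumPy. This is a minimal illustration of the convention described above, not Twiga's actual implementation:

```python
import numpy as np

# 3-D sliding-window batch: (batch, seq_len, features)
x = np.random.rand(10, 5, 3)

# Flatten to the 2-D (samples, features) layout sklearn estimators expect
x2d = x.reshape(x.shape[0], -1)  # (10, 15)

# A fitted estimator returns 2-D predictions; restore the 3-D semantics
num_targets = 2
preds_2d = np.zeros((x.shape[0], x.shape[1] * num_targets))
preds_3d = preds_2d.reshape(x.shape[0], x.shape[1], num_targets)  # (10, 5, 2)
```

The batch dimension is preserved throughout, which is what lets predict reshape its output back to (batch, seq_len, num_targets).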

API Reference#

class twiga.models.ml.core.base_regressor.BaseRegressor(data_pipeline=None)#

Bases: BaseEstimator, RegressorMixin

A base class for regression models compliant with scikit-learn pipelines.

Provides common functionality for feature formatting, model fitting, and forecasting.

Variables:
  • model (Any | None) – The regression model to be used. Must implement fit and predict methods.

  • data_pipeline (Any | None) – Optional pipeline to preprocess data.

  • num_targets (int | None) – The number of target variables.

  • feature_names (list[str] | None) – Column names captured at fit time (sklearn convention). Set when X is a DataFrame; otherwise synthetic names f0, f1, ... are used. Used at predict time to avoid sklearn feature-name mismatch warnings.

__init__(data_pipeline=None)#

Initialize the BaseRegressor with an optional data pipeline.

Parameters:

data_pipeline (Any | None) – Data pipeline for preprocessing (default is None).

Example

>>> reg = BaseRegressor()
fit(X, y, eval_set=None, verbose=False)#

Fit the regression model to the training data.

Parameters:
  • X (ndarray) – Training input features.

  • y (ndarray) – Target values corresponding to the training inputs.

  • eval_set (tuple[ndarray, ndarray] | None) – Optional (X_val, y_val) tuple for early stopping. Passed through to subclass implementations that support it.

  • verbose (bool) – Flag to control verbosity (default is False).

Return type:

BaseRegressor

Returns:

BaseRegressor – The instance itself.

Raises:

ValueError – If no model has been set prior to calling fit.

Example

>>> X = np.random.rand(10, 5, 3)
>>> y = np.random.rand(10, 5, 2)
>>> reg = BaseRegressor()
>>> reg.model = SomeModel()  # assign a model implementing fit/predict
>>> reg.fit(X, y, verbose=True)
forecast(x)#

Forecast output using the fitted regression model.

Parameters:

x (ndarray) – Input features for forecasting.

Return type:

ndarray | dict

Returns:

dict – A dictionary containing the predicted values with key “loc”.

Example

>>> x = np.random.rand(10, 5, 3)
>>> reg = BaseRegressor()
>>> reg.model = SomeModel()  # assign a model implementing predict
>>> forecast_output = reg.forecast(x)
>>> "loc" in forecast_output
True
format_features(x)#

Format the input features by flattening multi-dimensional arrays into 2D arrays.

Parameters:

x (ndarray | DataFrame) – Input features. If x has more than 2 dimensions, it is reshaped to have shape (samples, features).

Return type:

ndarray

Returns:

np.ndarray – The formatted 2D feature array.

Example

>>> x = np.random.rand(10, 5, 3)
>>> reg = BaseRegressor()
>>> x_formatted = reg.format_features(x)
>>> x_formatted.shape
(10, 15)
get_params(deep=True)#

Return the model parameters.

Parameters:

deep (bool) – If True, return parameters for this estimator and its subobjects.

Return type:

dict

Returns:

dict – A dictionary of parameters.

Example

>>> reg = BaseRegressor()
>>> params = reg.get_params()
predict(x)#

Predict output using the fitted regression model.

Parameters:

x (ndarray) – Input features for prediction. Expected to be 3D (samples, time_steps, channels).

Return type:

ndarray

Returns:

np.ndarray – Predicted values, reshaped to (samples, time_steps, num_targets).

Raises:

ValueError – If no model has been set or if the input array is not 3-dimensional.

Example

>>> x = np.random.rand(10, 5, 3)
>>> reg = BaseRegressor()
>>> reg.model = SomeModel()  # assign a model implementing predict
>>> predictions = reg.predict(x)
>>> predictions.shape  # Expected: (10, 5, num_targets)
set_fit_request(*, eval_set='$UNCHANGED$', verbose='$UNCHANGED$')#

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.

  • verbose (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for verbose parameter in fit.

Returns:

self (object) – The updated object.

set_params(**params)#

Set the model parameters.

Parameters:

**params (Any) – Parameters to set in the model.

Return type:

BaseRegressor

Returns:

BaseRegressor – The instance with updated parameters.

Example

>>> reg = BaseRegressor()
>>> reg = reg.set_params(model=SomeModel())
set_predict_request(*, x='$UNCHANGED$')#

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.

Returns:

self (object) – The updated object.

set_score_request(*, sample_weight='$UNCHANGED$')#

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

update(trial)#

Rebuild the model from an Optuna trial’s suggested hyperparameters.

Subclasses should override this to apply trial params to self.model_config and reinitialise the underlying estimator. The base implementation is a no-op.

Parameters:

trial (Any) – An optuna.Trial object.

Return type:

None
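A subclass override of update() might look like the sketch below. The class and the trial stub are hypothetical, used here only to illustrate the "apply trial params, then rebuild" pattern; a real implementation would receive an optuna.Trial:

```python
class _TrialStub:
    """Duck-typed stand-in for optuna.Trial, for illustration only."""
    def suggest_int(self, name, low, high):
        return (low + high) // 2
    def suggest_float(self, name, low, high, log=False):
        return low

class DepthTunable:
    """Hypothetical subclass showing the update() pattern."""
    def __init__(self):
        self.model_config = {"depth": 6, "learning_rate": 0.1}
        self.model = None

    def update(self, trial):
        # Apply the trial's suggestions to the config...
        self.model_config["depth"] = trial.suggest_int("depth", 1, 12)
        self.model_config["learning_rate"] = trial.suggest_float(
            "learning_rate", 1e-3, 1e-1, log=True)
        # ...then reinitialise the underlying estimator, e.g.:
        # self.model = SomeEstimator(**self.model_config)

reg = DepthTunable()
reg.update(_TrialStub())
```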

Point Forecast Models#

CATBOOSTModel#

CatBoost uses ordered boosting with built-in categorical feature handling and native GPU acceleration. The Twiga wrapper applies MultiOutputRegressor to support multi-horizon forecasting.

CATBOOSTConfig#

| Field | Type | Default | Description |
|---|---|---|---|
| name | Literal["catboost"] | "catboost" | Model identifier. Excluded from parameter dumps. |
| domain | Literal["ml"] | "ml" | Domain identifier. Excluded from parameter dumps. |
| task_type | Literal["GPU", "CPU"] | "CPU" | Hardware acceleration type. |
| random_state | int | 42 | Random seed for reproducibility (must be > 0). |
| verbose | Literal[0, 1, 2] | 0 | Verbosity level: 0 (silent), 1 (minimal), 2 (detailed). |
| allow_writing_files | bool | False | Whether CatBoost may write temporary files during training. |
| search_space | BaseSearchSpace | (see below) | Default hyperparameter search space. Excluded from parameter dumps. |

Default Search Space:

| Parameter | Range | Type | Scale |
|---|---|---|---|
| learning_rate | (1e-3, 1e-1) | float | Log |
| depth | (1, 12) | int | Linear |
| iterations | (20, 1000) | int | Linear |
| min_data_in_leaf | (1, 100) | int | Linear |

Underlying estimator: MultiOutputRegressor(CatBoostRegressor(**config.dict()))
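The wrapping can be sketched with a scikit-learn stand-in. GradientBoostingRegressor substitutes for CatBoostRegressor here (catboost may not be installed); the point is only to show how MultiOutputRegressor turns a single-target booster into a multi-horizon forecaster:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

# Flattened sliding windows: (batch, seq_len * features) and 4 horizon columns
X = np.random.rand(50, 15)
y = np.random.rand(50, 4)

# MultiOutputRegressor fits one independent booster per target column
est = MultiOutputRegressor(GradientBoostingRegressor(n_estimators=20))
est.fit(X, y)
preds = est.predict(X)  # shape (50, 4): one column per forecast horizon
```

The cost of this strategy is one fitted booster per horizon; the benefit is that any single-output regressor gains multi-horizon support.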

GPU acceleration

Set task_type="GPU" to enable CUDA-based training. CatBoost’s GPU implementation supports most regression objectives and can deliver significant speedups on large datasets. Ensure CUDA drivers and the catboost GPU build are installed.

XGBOOSTModel#

XGBoost provides regularized gradient boosting with efficient histogram-based splits. Unlike CatBoost, the Twiga wrapper uses XGBRegressor directly, since XGBoost handles multi-output regression natively.

XGBOOSTConfig#

| Field | Type | Default | Description |
|---|---|---|---|
| name | Literal["xgboost"] | "xgboost" | Model identifier. Excluded from parameter dumps. |
| domain | Literal["ml"] | "ml" | Domain identifier. Excluded from parameter dumps. |
| device | Literal["gpu", "cpu"] | "cpu" | Hardware acceleration type. |
| objective | Literal["reg:squarederror"] | "reg:squarederror" | Regression objective function. |
| random_state | int | 42 | Random seed for reproducibility (must be > 0). |
| verbose | Literal[0, 1, 2] | 0 | Verbosity level: 0 (silent), 1 (minimal), 2 (detailed). |
| search_space | BaseSearchSpace | (see below) | Default hyperparameter search space. Excluded from parameter dumps. |

Default Search Space:

| Parameter | Range | Type | Scale |
|---|---|---|---|
| learning_rate | (1e-3, 1e-1) | float | Log |
| iterations | (20, 1000) | int | Linear |
| subsample | (0.05, 1.0) | float | Linear |
| gamma | (0, 10) | int | Linear |
| eta | (0.1, 1.0) | float | Linear |
| colsample_bytree | (0.05, 1.0) | float | Linear |
| min_child_weight | (1, 20) | int | Linear |
| n_estimators | (10, 500) | int | Linear |
| max_depth | (1, 10) | int | Linear |

Underlying estimator: XGBRegressor(**config.dict())

LIGHTGBMModel#

LightGBM uses histogram-based gradient boosting with leaf-wise tree growth, making it fast on large datasets. Supports GBDT, DART, and GOSS boosting strategies. The Twiga wrapper applies MultiOutputRegressor for multi-horizon output.

LIGHTGBMConfig#

| Field | Type | Default | Description |
|---|---|---|---|
| name | Literal["lightgbm"] | "lightgbm" | Model identifier. Excluded from parameter dumps. |
| domain | Literal["ml"] | "ml" | Domain identifier. Excluded from parameter dumps. |
| random_state | int | 42 | Random seed for reproducibility (must be > 0). |
| verbose | Literal[-1, 0, 1] | -1 | Verbosity level: -1 (silent), 0 (warnings), 1 (info). |
| boosting_type | Literal["gbdt", "dart", "goss"] | "gbdt" | Gradient boosting method. |
| objective | Literal["regression"] | "regression" | Regression objective. |
| metric | Literal["rmse"] | "rmse" | Evaluation metric (root mean squared error). |
| bagging_freq | int | 1 | Frequency for bagging iterations. |
| search_space | BaseSearchSpace | (see below) | Default hyperparameter search space. Excluded from parameter dumps. |

Default Search Space:

| Parameter | Range | Type | Scale |
|---|---|---|---|
| learning_rate | (1e-3, 1e-1) | float | Log |
| num_leaves | (2, 1024) | int | Linear |
| subsample | (0.05, 1.0) | float | Linear |
| colsample_bytree | (0.05, 1.0) | float | Linear |
| min_data_in_leaf | (1, 100) | int | Linear |
| n_estimators | (10, 200) | int | Linear |
| max_depth | (1, 10) | int | Linear |
| linear_tree | [True, False] | categorical | - |
| iterations | (20, 1000) | int | Linear |

Underlying estimator: MultiOutputRegressor(LGBMRegressor(**config.dict()))

Boosting strategies

  • GBDT (Gradient Boosted Decision Trees) - the standard approach, generally the most robust.

  • DART (Dropouts meet Multiple Additive Regression Trees) - applies dropout to trees for regularisation, can reduce overfitting at the cost of slower training.

  • GOSS (Gradient-based One-Side Sampling) - keeps instances with large gradients and randomly samples those with small gradients, trading accuracy for speed on very large datasets.

LINEAREGModel#

Ordinary least squares linear regression wrapped in MultiOutputRegressor. Serves as a lightweight baseline for benchmarking more complex models.

LINEAREGConfig#

| Field | Type | Default | Description |
|---|---|---|---|
| name | Literal["lineareg"] | "lineareg" | Model identifier. Excluded from parameter dumps. |
| domain | Literal["ml"] | "ml" | Domain identifier. Excluded from parameter dumps. |
| fit_intercept | bool | True | Whether to calculate the intercept for this model. |
| search_space | BaseSearchSpace | (see below) | Default hyperparameter search space. Excluded from parameter dumps. |

Default Search Space:

| Parameter | Range | Type | Scale |
|---|---|---|---|
| fit_intercept | [True, False] | categorical | - |

Underlying estimator: MultiOutputRegressor(LinearRegression(**config.dict()))

RANDOMFORESTModel#

Random Forest is an ensemble of decision trees trained via bagging with feature subsampling at each split. It is a strong baseline for tabular time series data and supports native multi-output regression, so no MultiOutputRegressor wrapper is required.

RANDOMFORESTConfig#

| Field | Type | Default | Description |
|---|---|---|---|
| name | Literal["randomforest"] | "randomforest" | Model identifier. Excluded from parameter dumps. |
| domain | Literal["ml"] | "ml" | Domain identifier. Excluded from parameter dumps. |
| random_state | int | 42 | Random seed for reproducibility (must be > 0). |
| n_jobs | int | -1 | Number of parallel jobs for fitting and prediction. -1 uses all available CPUs. |
| search_space | BaseSearchSpace | (see below) | Default hyperparameter search space. Excluded from parameter dumps. |

Default Search Space:

| Parameter | Range | Type | Scale |
|---|---|---|---|
| n_estimators | (10, 500) | int | Linear |
| max_depth | (1, 30) | int | Linear |
| min_samples_split | (2, 20) | int | Linear |
| min_samples_leaf | (1, 10) | int | Linear |
| max_features | ["sqrt", "log2", None] | categorical | - |

Underlying estimator: RandomForestRegressor(**config.dict()) (native multi-output, no MultiOutputRegressor needed).

Native multi-output support

Unlike most other ML models in Twiga, RandomForestRegressor natively handles multiple output columns. There is no need for MultiOutputRegressor wrapping, which means all output trees are grown jointly and the model can exploit correlations between forecast horizons.
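The native multi-output behaviour is easy to see with scikit-learn alone, since RandomForestRegressor accepts a 2-D target directly. A minimal sketch (shapes chosen for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Flattened sliding windows and a 2-D target with 4 horizon columns
X = np.random.rand(50, 15)
y = np.random.rand(50, 4)

# A single forest fits all horizon columns jointly: each tree stores a
# 4-dimensional value per leaf, so no MultiOutputRegressor wrapper is needed
rf = RandomForestRegressor(n_estimators=10, random_state=42)
rf.fit(X, y)
preds = rf.predict(X)  # shape (50, 4) from one fitted model
```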

Probabilistic ML Models#

All ML probabilistic models (GaussCatBoost, NGBoost Normal/LogNormal/Exponential, QR CatBoost/XGBoost/LightGBM/RandomForest) are documented on the Probabilistic Forecasting pages.

Model Comparison#

The table below covers the point forecast ML models. For the full picture including probabilistic variants see the Model Catalog Overview.

| Model | Config Class | Output | Wrapping | GPU |
|---|---|---|---|---|
| CATBOOSTModel | CATBOOSTConfig | Point | MultiOutputRegressor(CatBoostRegressor) | Yes |
| XGBOOSTModel | XGBOOSTConfig | Point | XGBRegressor (native multi-output) | Yes |
| LIGHTGBMModel | LIGHTGBMConfig | Point | MultiOutputRegressor(LGBMRegressor) | No |
| LINEAREGModel | LINEAREGConfig | Point | MultiOutputRegressor(LinearRegression) | No |
| RANDOMFORESTModel | RANDOMFORESTConfig | Point | RandomForestRegressor (native multi-output) | No |

Usage Example#

The following example shows how to configure, train, and evaluate an ML model using the TwigaForecaster.

import pandas as pd
from sklearn.preprocessing import StandardScaler, RobustScaler

from twiga.core.config import DataPipelineConfig, ForecasterConfig
from twiga.forecaster.core import TwigaForecaster
from twiga.models.ml.catboost_model import CATBOOSTConfig
from twiga.models.ml.xgboost_model import XGBOOSTConfig
from twiga.models.ml.lightgbm_model import LIGHTGBMConfig

# --- Data pipeline ---
data_config = DataPipelineConfig(
    target_feature="load_mw",
    period="1h",
    lookback_window_size=168,
    forecast_horizon=48,
    calendar_features=["hour", "dayofweek", "month"],
    exogenous_features=["ghi", "temperature"],
    lags=[1, 24, 48, 168],
    windows=[24, 48],
    window_funcs=["mean", "std"],
    input_scaler=StandardScaler(),
    target_scaler=RobustScaler(),
)

# --- Model configs ---
catboost_config = CATBOOSTConfig(task_type="CPU")
xgboost_config = XGBOOSTConfig(device="cpu")
lightgbm_config = LIGHTGBMConfig(boosting_type="gbdt")

# --- Training orchestration ---
train_config = ForecasterConfig(
    split_freq="months",
    train_size=6,
    test_size=1,
    window="expanding",
    project_name="EnergyForecast",
    seed=42,
)

# --- Build and train ---
forecaster = TwigaForecaster(
    data_params=data_config,
    model_params=[catboost_config, xgboost_config, lightgbm_config],
    train_params=train_config,
)

data = pd.read_parquet("data/timeseries.parquet")
train_df = data[data.timestamp <= "2024-06-01"]
test_df = data[data.timestamp > "2024-06-01"]

forecaster.fit(train_df=train_df)
results_df, metrics_df = forecaster.evaluate_point_forecast(test_df=test_df)

print(metrics_df)

Custom Search Space Example#

from twiga.core.config import BaseSearchSpace
from twiga.forecaster.core import TwigaForecaster
from twiga.models.ml.catboost_model import CATBOOSTConfig

custom_search = BaseSearchSpace(
    learning_rate=(0.001, 0.3),
    depth=(3, 10),
    iterations=(100, 500),
)

catboost_config = CATBOOSTConfig(search_space=custom_search)

# data_config and train_config as defined in the usage example above
forecaster = TwigaForecaster(
    data_params=data_config,
    model_params=[catboost_config],
    train_params=train_config,
)

# Initial fit, then tune hyperparameters with Optuna
forecaster.fit(train_df=train_df)
forecaster.tune(train_df=train_df, val_df=val_df, num_trials=50)  # val_df: a held-out validation split

# Retrain with the best parameters found during tuning
forecaster.fit(train_df=train_df)

See Hyperparameter Tuning for advanced tuning recipes including warm-starting, pruning strategies, and multi-objective optimisation.

Class Reference#

Autodoc: Base Classes#

class twiga.models.ml.prob.base_quantile.BaseQuantileRegressor(data_pipeline=None, model_instance=None, model_config=None, quantiles=None, conf_level=0.05)

Bases: BaseRegressor

Base class for quantile regression models compatible with scikit-learn pipelines.

Supports multiple quantile models for probabilistic forecasting using any regression model (e.g., LightGBM, XGBoost, CatBoost). Automatically handles multi-output regression depending on model capabilities.
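The one-model-per-quantile strategy can be sketched with scikit-learn's GradientBoostingRegressor and its pinball (quantile) loss. This is an illustrative stand-in, not BaseQuantileRegressor's actual backend; the class builds the equivalent structure around whichever regression model is configured:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.random((100, 5))
y = X.sum(axis=1) + rng.normal(0, 0.1, 100)

# One model per quantile, each minimising the pinball loss at its alpha
quantiles = [0.05, 0.5, 0.95]
models = {
    q: GradientBoostingRegressor(loss="quantile", alpha=q,
                                 n_estimators=50).fit(X, y)
    for q in quantiles
}
preds = {q: m.predict(X) for q, m in models.items()}
# The 0.05 and 0.95 predictions bound a 90% interval around the 0.5 median
```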

Variables:
  • data_pipeline (Any | None) – Optional preprocessing pipeline.

  • model_instance (Any) – Regression model class.

  • model_config (dict) – Configuration dictionary for the model.

  • quantiles (list[float]) – List of quantiles including confidence interval bounds.

  • conf_level (float) – Confidence level for interval predictions.

  • supports_multi_output (bool) – Whether the model supports multi-output regression.

  • models (dict[float, Any]) – Mapping from quantile to its corresponding trained model.

  • num_targets (int | None) – Number of output targets for multi-output cases.

  • horizon (int) – Prediction horizon length.

__init__(data_pipeline=None, model_instance=None, model_config=None, quantiles=None, conf_level=0.05)

Initializes the BaseQuantileRegressor.

Parameters:
  • data_pipeline (Any | None) – Optional feature preprocessing pipeline.

  • model_instance (Any) – Regression model class (e.g., LGBMRegressor).

  • model_config (dict | None) – Parameters to initialize the model.

  • quantiles (list[float] | None) – List of quantiles to model (e.g., [0.25, 0.5, 0.75]). Defaults to [0.25, 0.5, 0.75, 0.95] plus confidence bounds.

  • conf_level (float) – Confidence level for interval bounds (e.g., 0.05 for 95% CI).

fit(X, y, eval_set=None, verbose=False)

Fit the regression model to the training data.

Parameters:
  • X (ndarray) – Training input features.

  • y (ndarray) – Target values corresponding to the training inputs.

  • eval_set (tuple[ndarray, ndarray] | None) – Optional (X_val, y_val) for early stopping (reserved for future per-backend support; currently accepted but not forwarded).

  • verbose (bool) – Flag to control verbosity (default is False).

Return type:

BaseQuantileRegressor

Returns:

BaseQuantileRegressor – The instance itself.

Raises:

ValueError – If no model has been set prior to calling fit.

Example

>>> X = np.random.rand(10, 5, 3)
>>> y = np.random.rand(10, 5, 2)
>>> reg = BaseQuantileRegressor(model_instance=LinearRegression)
>>> reg.fit(X, y, verbose=True)
forecast(x, sigma=False)

Forecast output using the fitted regression model.

Parameters:
  • x (ndarray) – Input features for forecasting.

  • sigma (bool) – Whether to return sigma predictions (default is False).

Return type:

dict

Returns:

dict – A dictionary containing the predicted values with key “loc”.

Example

>>> x = np.random.rand(10, 5, 3)
>>> reg = BaseRegressor()
>>> reg.model = SomeModel()  # assign a model implementing predict
>>> forecast_output = reg.forecast(x)
>>> "loc" in forecast_output
True
get_median_quantile(predictions)

Get median quantile predictions for the input data.

Parameters:

predictions (ndarray) – Input features of shape (n_samples, n_features).

Return type:

ndarray

Returns:

np.ndarray – Median quantile predictions of shape (n_samples, n_targets).

get_sigma_quantile(predictions)

Get sigma quantile predictions for the input data.

Parameters:

predictions (ndarray) – Quantile predictions of shape (n_samples, n_features).

Return type:

ndarray

Returns:

np.ndarray – Sigma quantile predictions of shape (n_samples, n_targets).
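
The slicing these helpers perform can be sketched as follows. The quantile-major column layout (one block of n_targets columns per quantile) is an assumption for illustration; the actual layout used by twiga may differ:

```python
import numpy as np

# Hypothetical layout: one block of n_targets columns per quantile.
quantiles = [0.25, 0.5, 0.75]
n_samples, n_targets = 4, 2
preds = np.arange(n_samples * len(quantiles) * n_targets, dtype=float)
preds = preds.reshape(n_samples, len(quantiles) * n_targets)

# Slice out the columns belonging to the 0.5 quantile.
median_idx = quantiles.index(0.5)
median = preds[:, median_idx * n_targets:(median_idx + 1) * n_targets]
assert median.shape == (n_samples, n_targets)
```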

predict(X, sigma=False)

Predict quantile values and optionally sigma for the input data.

Parameters:
  • X (ndarray) – Input features of shape (n_samples, seq_len, n_features).

  • sigma (bool) – Whether to return sigma predictions (default is False).

Returns:

tuple – Median predictions and either sigma predictions or full quantile predictions.
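
The shape convention behind predict() can be illustrated with plain NumPy (a shapes-only sketch; fake_out stands in for the wrapped model's raw output):

```python
import numpy as np

# 3-D input is flattened for the underlying estimator, and the output
# is reshaped back to (batch, seq_len, targets).
B, L, F, H = 10, 5, 3, 2
x = np.random.rand(B, L, F)
flat = x.reshape(B, L * F)            # what the wrapped model sees
fake_out = np.random.rand(B, L * H)   # stand-in for raw model output
median = fake_out.reshape(B, L, H)    # back to (batch, seq_len, targets)
assert flat.shape == (10, 15) and median.shape == (10, 5, 2)
```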

set_fit_request(*, eval_set='$UNCHANGED$', verbose='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters#

eval_set : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for eval_set parameter in fit.

verbose : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for verbose parameter in fit.

Returns#

self : object

The updated object.

set_predict_request(*, sigma='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters#

sigma : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sigma parameter in predict.

Returns#

self : object

The updated object.

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters#

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns#

self : object

The updated object.

update(trial)

Update the quantile regression model using hyperparameter suggestions.

The model is re-instantiated with new parameters obtained from the configuration’s search space.

Parameters:

trial – An Optuna trial object used to sample hyperparameters.
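
The update() pattern (sample hyperparameters from a search space, then re-instantiate the model with them) can be sketched without Optuna installed; FakeTrial and the one-entry search space below are stand-ins for illustration only:

```python
# FakeTrial mimics the suggest_float method of an Optuna trial.
class FakeTrial:
    def suggest_float(self, name, low, high):
        # A real trial samples from [low, high]; here we take the midpoint.
        return (low + high) / 2.0

# Toy search space: parameter name -> (low, high).
search_space = {"learning_rate": (0.01, 0.3)}
trial = FakeTrial()
params = {k: trial.suggest_float(k, lo, hi) for k, (lo, hi) in search_space.items()}
# These params would then be passed to the model constructor.
assert abs(params["learning_rate"] - 0.155) < 1e-9
```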

Autodoc: Point Forecast Models#

class twiga.models.ml.catboost_model.CATBOOSTConfig(**data)#

Bases: BaseModelConfig

Configuration model for CatBoost algorithms.

Variables:
  • name – Identifier for the model type, fixed to “catboost”.

  • task_type – Hardware acceleration type (“GPU” or “CPU”).

  • random_state – Seed for random number generation.

  • verbose – Verbosity level (0 silent, 1 minimal, 2 detailed).

  • allow_writing_files – Whether the model may write files to disk.

  • l2_leaf_reg – L2 regularisation coefficient on leaf values.

  • bagging_temperature – Bayesian bootstrap temperature; 1.0 = uniform, 0.0 = no bagging. Higher values add more variance.

  • search_space – Optuna hyperparameter search space.

allow_writing_files: bool#
bagging_temperature: float#
domain: Literal['ml']#
l2_leaf_reg: float#
loss_function: Literal['MultiRMSE']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model; a dictionary conforming to pydantic's ConfigDict.

name: Literal['catboost']#
random_state: int#
search_space: BaseSearchSpace#
task_type: Literal['GPU', 'CPU']#
verbose: Literal[0, 1, 2]#
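
The extra='allow' setting means undeclared keys are kept on the config instead of raising a validation error. A minimal sketch with a toy Pydantic model (not twiga's BaseModelConfig):

```python
from pydantic import BaseModel, ConfigDict

class ToyConfig(BaseModel):
    # Same setting as the twiga configs: tolerate unknown fields.
    model_config = ConfigDict(extra="allow")
    random_state: int = 42

cfg = ToyConfig(random_state=7, depth=6)  # depth is not declared
assert cfg.random_state == 7
assert cfg.depth == 6                     # extra field is preserved
```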
class twiga.models.ml.catboost_model.CATBOOSTModel(model_config=None)#

Bases: BaseRegressor

Multi-output CatBoost regression model.

Wraps CatBoostRegressor in a MultiOutputRegressor to support multi-step-ahead point forecasting.

Parameters:

model_config (CATBOOSTConfig | None) – CatBoost configuration. Defaults to CATBOOSTConfig.
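
The wrapping pattern described above can be sketched with scikit-learn alone; LinearRegression stands in for CatBoostRegressor so the example runs without catboost installed:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.multioutput import MultiOutputRegressor

# One independent regressor is fitted per output column.
X = np.random.rand(20, 6)   # flattened (batch, seq_len * features)
y = np.random.rand(20, 3)   # three forecast horizons
model = MultiOutputRegressor(LinearRegression()).fit(X, y)
assert model.predict(X).shape == (20, 3)
```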

set_fit_request(*, eval_set='$UNCHANGED$', verbose='$UNCHANGED$')#

set_predict_request(*, x='$UNCHANGED$')#

set_score_request(*, sample_weight='$UNCHANGED$')#

Standard scikit-learn metadata-routing helpers (added in scikit-learn 1.3); their behaviour is identical to the versions documented under BaseQuantileRegressor above, except that predict routes x here rather than sigma.

update(trial)#

Rebuild model from an Optuna trial’s suggested hyperparameters.

Return type:

None

class twiga.models.ml.lightgbm_model.LIGHTGBMConfig(**data)#

Bases: BaseModelConfig

Configuration for LightGBM point forecasting.

Variables:
  • name – Model identifier fixed to "lightgbm".

  • domain – Domain fixed to "ml". Excluded from tuning.

  • random_state – Seed for reproducibility.

  • verbose – LightGBM verbosity (-1 silent, 0 warnings, 1 info).

  • boosting_type – Gradient boosting method.

  • objective – LightGBM regression objective.

  • metric – Evaluation metric used when an eval set is provided.

  • bagging_freq – Enable bagging by setting to a positive integer (bagging performed every bagging_freq iterations).

  • early_stopping_rounds – Stop training if the validation metric does not improve for this many rounds. Only activates when an eval_set is passed to LIGHTGBMModel.fit(). Excluded from model params (passed via LightGBM callbacks).

  • search_space – Optuna hyperparameter search space.

bagging_freq: int#
boosting_type: Literal['gbdt', 'dart', 'goss']#
domain: Literal['ml']#
early_stopping_rounds: int#
metric: Literal['rmse']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model; a dictionary conforming to pydantic's ConfigDict.

name: Literal['lightgbm']#
objective: Literal['regression']#
random_state: int#
search_space: BaseSearchSpace#
verbose: Literal[-1, 0, 1]#
class twiga.models.ml.lightgbm_model.LIGHTGBMModel(model_config=None)#

Bases: BaseRegressor

Multi-output LightGBM regression model.

Fits one LGBMRegressor per output step (horizon), enabling per-target early stopping via LightGBM callbacks when a validation set is provided.

Parameters:

model_config (LIGHTGBMConfig | None) – LightGBM configuration. Defaults to LIGHTGBMConfig.

fit(X, y, eval_set=None, verbose=False)#

Fit one LGBMRegressor per output step.

Parameters:
  • X (ndarray) – Shape (B, L, F) - batch × sequence × features.

  • y (ndarray) – Shape (B, L, H) - batch × sequence × horizons.

  • eval_set (tuple[ndarray, ndarray] | None) – Optional (X_val, y_val) for early stopping. Each per-output model receives its corresponding validation column and uses lightgbm.early_stopping() with the early_stopping_rounds from the config.

  • verbose (bool) – Whether to print LightGBM training logs.

Return type:

LIGHTGBMModel

Returns:

Self for method chaining.
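
The per-output fitting loop can be sketched with LinearRegression in place of LGBMRegressor (a sketch of the wiring only; it omits the early-stopping callbacks):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# One model per flattened output column (horizon step).
B, L, F, H = 8, 4, 3, 2
X = np.random.rand(B, L, F).reshape(B, L * F)
y = np.random.rand(B, L, H).reshape(B, L * H)
models = [LinearRegression().fit(X, y[:, h]) for h in range(y.shape[1])]
preds = np.column_stack([m.predict(X) for m in models])
assert preds.shape == y.shape
```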

models: list[lightgbm.LGBMRegressor]#
num_targets: int | None#
predict(x)#

Predict output using fitted per-output models.

Parameters:

x (ndarray) – Shape (B, L, F).

Return type:

ndarray

Returns:

Predictions of shape (B, L, H).

Raises:

ValueError – If the model has not been fitted or x is not 3-D.

set_fit_request(*, eval_set='$UNCHANGED$', verbose='$UNCHANGED$')#

set_predict_request(*, x='$UNCHANGED$')#

set_score_request(*, sample_weight='$UNCHANGED$')#

Standard scikit-learn metadata-routing helpers (added in scikit-learn 1.3); their behaviour is identical to the versions documented under BaseQuantileRegressor above, except that predict routes x here rather than sigma.

update(trial)#

Rebuild models from an Optuna trial’s suggested hyperparameters.

Stores the sampled params; they are applied on the next fit() call.

Return type:

None

class twiga.models.ml.xgboost_model.XGBOOSTConfig(**data)#

Bases: BaseModelConfig

Configuration for XGBoost point forecasting.

Variables:
  • name – Model identifier fixed to "xgboost".

  • domain – Domain fixed to "ml". Excluded from tuning.

  • random_state – Seed for reproducibility.

  • verbose – Verbosity level (0 silent, 1 minimal, 2 detailed).

  • device – Hardware target ("cpu" or "gpu").

  • tree_method – Tree construction algorithm. "hist" is required for native multi-output and GPU training (XGBoost 3.x default).

  • objective – XGBoost regression objective.

  • early_stopping_rounds – Stop training if validation metric does not improve for this many consecutive rounds. Only activates when an eval_set is passed to XGBOOSTModel.fit().

  • search_space – Optuna hyperparameter search space.

device: Literal['cpu', 'gpu']#
domain: Literal['ml']#
early_stopping_rounds: int#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model; a dictionary conforming to pydantic's ConfigDict.

name: Literal['xgboost']#
objective: Literal['reg:squarederror']#
random_state: int#
search_space: BaseSearchSpace#
tree_method: Literal['hist', 'approx', 'exact']#
verbose: Literal[0, 1, 2]#
class twiga.models.ml.xgboost_model.XGBOOSTModel(model_config=None)#

Bases: BaseRegressor

Multi-output XGBoost regression model.

Wraps XGBRegressor using native multi-output support (tree_method="hist"). Accepts an optional validation set to activate early stopping during training.

Parameters:

model_config (XGBOOSTConfig | None) – XGBoost configuration. Defaults to XGBOOSTConfig.

fit(X, y, eval_set=None, verbose=False)#

Fit the XGBoost model, optionally with early stopping.

Parameters:
  • X (ndarray) – Shape (B, L, F) - batch × sequence × features.

  • y (ndarray) – Shape (B, L, H) - batch × sequence × horizons.

  • eval_set (tuple[ndarray, ndarray] | None) – Optional (X_val, y_val) for early stopping. When provided, training stops if the validation metric does not improve for early_stopping_rounds consecutive rounds.

  • verbose (bool) – Whether to print XGBoost training logs.

Return type:

XGBOOSTModel

Returns:

Self for method chaining.
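
The early-stopping rule eval_set enables can be sketched as a generic loop (this illustrates the stopping criterion only, not XGBoost's internals; the loss values are made up):

```python
# Stop when the validation metric fails to improve for `rounds`
# consecutive iterations.
val_losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.64]
rounds, best, since_best, stop_at = 3, float("inf"), 0, None
for i, loss in enumerate(val_losses):
    if loss < best:
        best, since_best = loss, 0   # new best: reset the counter
    else:
        since_best += 1
        if since_best >= rounds:
            stop_at = i              # patience exhausted at iteration i
            break
assert best == 0.6 and stop_at == 5
```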

set_fit_request(*, eval_set='$UNCHANGED$', verbose='$UNCHANGED$')#

set_predict_request(*, x='$UNCHANGED$')#

set_score_request(*, sample_weight='$UNCHANGED$')#

Standard scikit-learn metadata-routing helpers (added in scikit-learn 1.3); their behaviour is identical to the versions documented under BaseQuantileRegressor above, except that predict routes x here rather than sigma.

update(trial)#

Rebuild model from an Optuna trial’s suggested hyperparameters.

Return type:

None

class twiga.models.ml.randomforest_model.RANDOMFORESTConfig(**data)#

Bases: BaseModelConfig

Configuration for Random Forest point forecasting.

Variables:
  • name – Model identifier fixed to "randomforest".

  • domain – Domain fixed to "ml". Excluded from tuning.

  • random_state – Seed for reproducibility.

  • n_jobs – Number of parallel jobs during fit and predict. -1 uses all available CPUs.

  • search_space – Optuna hyperparameter search space.

domain: Literal['ml']#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model; a dictionary conforming to pydantic's ConfigDict.

n_jobs: int#
name: Literal['randomforest']#
random_state: int#
search_space: BaseSearchSpace#
class twiga.models.ml.randomforest_model.RANDOMFORESTModel(model_config=None)#

Bases: BaseRegressor

Multi-output Random Forest regression model.

Wraps RandomForestRegressor which natively handles multi-output regression without requiring MultiOutputRegressor.

Random Forest is a strong non-parametric baseline with built-in feature importance, requires no scaling, and is robust to outliers. It does not support early stopping; the eval_set argument is accepted but silently ignored.

Parameters:

model_config (RANDOMFORESTConfig | None) – RF configuration. Defaults to RANDOMFORESTConfig.

fit(X, y, eval_set=None, verbose=False)#

Fit the Random Forest model.

Parameters:
  • X (ndarray) – Shape (B, L, F) - batch × sequence × features.

  • y (ndarray) – Shape (B, L, H) - batch × sequence × horizons.

  • eval_set (tuple[ndarray, ndarray] | None) – Accepted for API compatibility; ignored (RF has no early stopping).

  • verbose (bool) – Unused; included for interface consistency.

Return type:

RANDOMFORESTModel

Returns:

Self for method chaining.
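
The native multi-output behaviour noted above can be checked directly with scikit-learn (a minimal sketch on random data, not twiga's wiring):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# RandomForestRegressor accepts a 2-D target natively, so no
# MultiOutputRegressor wrapper is needed.
X = np.random.rand(30, 6)
y = np.random.rand(30, 3)
rf = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)
assert rf.predict(X).shape == (30, 3)
```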

set_fit_request(*, eval_set='$UNCHANGED$', verbose='$UNCHANGED$')#

set_predict_request(*, x='$UNCHANGED$')#

set_score_request(*, sample_weight='$UNCHANGED$')#

Standard scikit-learn metadata-routing helpers (added in scikit-learn 1.3); their behaviour is identical to the versions documented under BaseQuantileRegressor above, except that predict routes x here rather than sigma.

update(trial)#

Rebuild model from an Optuna trial’s suggested hyperparameters.

Return type:

None

class twiga.models.ml.lineareg_model.LINEAREGConfig(**data)#

Bases: BaseModelConfig

Configuration model for Linear Regression algorithms.

This class extends BaseModelConfig with parameters specific to Linear Regression, enabling configuration for basic regression tasks. It includes fixed parameters and a hyperparameter search space for tuning key parameters.

Variables:
  • name (Literal["lineareg"]) – Model identifier fixed to "lineareg". Input can be provided using the alias "model_name".

  • domain (Literal["ml"]) – Model domain fixed to “ml”. Excluded from parameter tuning.

  • fit_intercept (bool) – Whether to calculate the intercept for this model.

  • search_space (BaseSearchSpace) – Hyperparameter search space for optimization.

domain: Literal['ml']#
fit_intercept: bool#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model; a dictionary conforming to pydantic's ConfigDict.

name: Literal['lineareg']#
search_space: BaseSearchSpace#
class twiga.models.ml.lineareg_model.LINEAREGModel(model_config=None)#

Bases: BaseRegressor

Linear Regression model with multi-output support.

This class provides an interface for initializing and updating a LinearRegression model for regression tasks. It uses a configuration model (LINEAREGConfig) to manage hyperparameters and settings.

Parameters:

model_config (LINEAREGConfig | None) – Configuration for Linear Regression. If None, the default configuration is used.

Variables:
  • model_config (LINEAREGConfig) – The configuration object for Linear Regression.

  • model (MultiOutputRegressor) – The instantiated Linear Regression model wrapped for multi-output regression.
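
The setup described above can be sketched with scikit-learn directly (fit_intercept mirrors the config field; the data is random for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.multioutput import MultiOutputRegressor

# Linear baseline wrapped for multi-output regression.
X = np.random.rand(15, 4)
y = np.random.rand(15, 2)
lin = MultiOutputRegressor(LinearRegression(fit_intercept=True)).fit(X, y)
assert lin.predict(X).shape == (15, 2)
```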

set_fit_request(*, eval_set='$UNCHANGED$', verbose='$UNCHANGED$')#

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters#

eval_setstr, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for eval_set parameter in fit.

verbosestr, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for verbose parameter in fit.

Returns#

self : object

The updated object.

set_predict_request(*, x='$UNCHANGED$')#

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters#

x : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for x parameter in predict.

Returns#

self : object

The updated object.

set_score_request(*, sample_weight='$UNCHANGED$')#

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters#

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns#

self : object

The updated object.

update(trial)#

Update the Linear Regression model using hyperparameter suggestions.

Parameters:

trial – An Optuna trial object used to sample hyperparameters.

The model is re-instantiated with new parameters obtained from the configuration’s search space.
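The re-instantiation pattern can be sketched as follows. The search space and parameter names here are illustrative, not Twiga's actual LINEAREGConfig fields, and the stub trial stands in for a real Optuna trial:

```python
from sklearn.linear_model import LinearRegression
from sklearn.multioutput import MultiOutputRegressor

# Illustrative search space -- not the actual LINEAREGConfig defaults.
SEARCH_SPACE = {"fit_intercept": [True, False], "positive": [False, True]}

def update(trial):
    """Re-instantiate the model with hyperparameters sampled from the trial."""
    params = {name: trial.suggest_categorical(name, choices)
              for name, choices in SEARCH_SPACE.items()}
    return MultiOutputRegressor(LinearRegression(**params))

class _StubTrial:
    """Stand-in for an optuna.trial.Trial: always picks the first choice."""
    def suggest_categorical(self, name, choices):
        return choices[0]

model = update(_StubTrial())
print(model.estimator.get_params()["fit_intercept"])  # True
```

With a real Optuna study, `study.optimize` would pass a live trial into `update`, so each trial trains a freshly configured model.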

See Also#