Baseline Models#
Source Files
twiga/models/baseline/naive_model.py - NAIVEModel / NAIVEConfig
twiga/models/baseline/seasonal_naive_model.py - SEASONALNAIVEModel / SEASONALNAIVEConfig
twiga/models/baseline/window_average_model.py - WINDOWAVERAGEModel / WINDOWAVERAGEConfig
twiga/models/baseline/drift_model.py - DRIFTModel / DRIFTConfig
Twiga’s baseline domain provides four classical reference models with no trainable parameters: fitting stores only shape information and, for some strategies, simple summary statistics of the training targets. They serve as lower-bound benchmarks when evaluating learned models, as inputs to skill score calculations, and as lightweight forecast components for rapid prototyping. All baseline models share the same interface as ML and NN models and can be composed freely with TwigaForecaster.
For the full model catalogue see the Model Catalog Overview.
When to use baseline models
| Use case | Recommended model |
|---|---|
| Standard persistence benchmark | NAIVEModel with strategy "window_last" |
| Fixed-value reference (e.g. training mean) | NAIVEModel with strategy "mean" |
| Daily or weekly seasonality benchmark | SEASONALNAIVEModel |
| Level-adapted mean baseline | WINDOWAVERAGEModel |
| Trending signal benchmark | DRIFTModel |
| Skill score denominator | Any of the above, chosen to match the naive reference in your domain |
Baseline models are especially valuable in energy forecasting: a well-tuned seasonal naive forecast is a strong competitor for load, and any learned model that does not clearly outperform it on skill score should be scrutinised before production use.
Common Interface#
All four baseline models extend BaseRegressor and override its key methods with statistic-only logic. There are no trainable parameters.
Constructor Pattern#
Model(model_config: Config | None = None)
Each model defaults to its own config class when model_config is omitted.
NAIVEModel#
NAIVEModel implements four persistence strategies, ranging from the classical per-window naive forecast ("window_last") to a fixed zero reference ("zero"). It is the simplest and most commonly used baseline.
Prediction Strategies#
| Strategy | Formula | Description |
|---|---|---|
| "window_last" | ŷ_{t+h} = y_t | True persistence: repeats the last observed value in each input window across all forecast steps. |
| "last" | ŷ_{t+h} = y_T (T = final training step) | Uses the very last value seen during training, constant across all test samples and steps. |
| "mean" | ŷ_{t+h} = mean of the training targets | Per-output training-set mean, constant across all test samples and steps. |
| "zero" | ŷ_{t+h} = 0 | Predicts zero everywhere. Useful for centred or differenced targets. |
The "window_last" strategy adapts to the local level of each test window, as does the separate WINDOWAVERAGEModel; "last" and "mean" are global constants derived from the training set, and "zero" ignores the data entirely.
NAIVEConfig#
| Field | Type | Default | Description |
|---|---|---|---|
| | | | Model identifier. Excluded from parameter dumps. |
| | | | Domain identifier. Excluded from parameter dumps. |
| strategy | | "window_last" | Prediction strategy. |
| | | (see below) | Default hyperparameter search space. Excluded from parameter dumps. |
Default Search Space:
| Parameter | Values | Type |
|---|---|---|
| | | categorical |
Predict Logic#
For "window_last":
last_obs = x[:, -1, :num_targets]  # (B, num_targets)
# Repeat the last observation along a new horizon axis
preds = np.repeat(last_obs[:, np.newaxis, :], horizon, axis=1)  # (B, horizon, num_targets)
For "last", "mean", and "zero", a constant vector stored at fit time is broadcast to all samples and horizon steps.
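The four strategies can be sketched in plain NumPy as follows. This is a minimal illustration, not Twiga's implementation: the function name naive_predict, its arguments, and the assumption that training targets arrive as an array of shape (N, horizon, num_targets) are all hypothetical.

```python
import numpy as np


def naive_predict(x, y_train, strategy, horizon, num_targets):
    """Hypothetical stand-in for the four NAIVEModel strategies.

    x:       (B, L, F) input windows, targets in the first num_targets columns
    y_train: (N, horizon, num_targets) training targets (assumed layout)
    """
    batch = x.shape[0]
    if strategy == "window_last":
        const = x[:, -1, :num_targets]        # (B, num_targets), one value per window
    elif strategy == "last":
        const = y_train[-1, -1, :]            # final training value (assumed position)
    elif strategy == "mean":
        const = y_train.mean(axis=(0, 1))     # per-output training mean
    elif strategy == "zero":
        const = np.zeros(num_targets)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # Broadcast over the horizon axis (and over the batch axis for global constants).
    const = np.asarray(const, dtype=float).reshape(-1, 1, num_targets)
    return np.broadcast_to(const, (batch, horizon, num_targets)).copy()


x = np.arange(24, dtype=float).reshape(2, 4, 3)   # B=2, L=4, F=3
y_train = np.full((5, 2, 1), 3.0)                 # toy training targets
window_last = naive_predict(x, y_train, "window_last", horizon=2, num_targets=1)
train_mean = naive_predict(x, y_train, "mean", horizon=2, num_targets=1)
```

Only "window_last" reads from x at predict time; the other three strategies depend solely on what was available at fit time, which is why they are constant across test samples.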
API Reference#
- class twiga.models.baseline.naive_model.NAIVEModel(model_config=None)#
Bases: BaseRegressor
Naive baseline for point multi-horizon forecasting.
Implements four persistence strategies that serve as reference baselines for benchmarking learned models. The model has no trainable parameters; fit only stores summary statistics from the training targets.
For the "window_last" strategy the model is a true per-window persistence forecast: ŷ_{t+h} = y_t for all h, using the most recently observed target values from the input context.
The eval_set argument in fit() is accepted for API compatibility and silently ignored.
- Parameters:
model_config (NAIVEConfig | None) – Configuration object. Defaults to NAIVEConfig (strategy "window_last").
Example:
model = NAIVEModel()
model.fit(X_train, y_train)
preds = model.predict(X_test)  # shape (B, horizon, num_targets)
- fit(X, y, eval_set=None, verbose=False)#
Compute and store the training-set constant.
- Returns:
Self for method chaining.
- property min_seq_len: int#
Minimum lookback window length required by this model.
All naive strategies work with any L ≥ 1, so this always returns 1. Provided for API consistency with other baseline models.
- predict(x)#
Return naive predictions for the input batch.
- Parameters:
x (ndarray) – Shape (B, L, F).
- Returns:
Predictions of shape (B, horizon, num_targets).
- Raises:
ValueError – If the model has not been fitted or x is not 3-D.
- set_fit_request(*, eval_set='$UNCHANGED$', verbose='$UNCHANGED$')#
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Parameters#
- eval_set: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for eval_set parameter in fit.
- verbose: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for verbose parameter in fit.
Returns#
- self: object
The updated object.
- set_predict_request(*, x='$UNCHANGED$')#
Configure whether metadata should be requested to be passed to the predict method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Parameters#
- x: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for x parameter in predict.
Returns#
- self: object
The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')#
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Parameters#
- sample_weight: str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for sample_weight parameter in score.
Returns#
- self: object
The updated object.
SEASONALNAIVEModel#
SEASONALNAIVEModel implements the classical seasonal naive forecast. For each forecast step h, the prediction is the value observed exactly m steps before that step in the input window, where m is the resolved seasonal period. For forecast horizons longer than one season (H > m), the pattern wraps modularly.
Period Resolution#
The period parameter accepts either a raw integer (number of data steps) or a pandas duration string, which is resolved against the freq sampling frequency via _resolve_period(period, freq).
| period | freq | Resolved steps | Interpretation |
|---|---|---|---|
| "1D" | "1h" | 24 | Daily periodicity, hourly data |
| "1D" | "30min" | 48 | Daily periodicity, 30-min data |
| "7D" | "1h" | 168 | Weekly periodicity, hourly data |
| "7D" | "30min" | 336 | Weekly periodicity, 30-min data |
| 48 | (ignored) | 48 | Direct integer specification |
_resolve_period raises ValueError if period is not a whole multiple of freq, or if the resolved step count is not positive.
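The resolution rule can be approximated with pandas Timedelta arithmetic. The sketch below is an assumption about how such a helper might look, not the actual _resolve_period source; the name resolve_period_steps is hypothetical.

```python
import pandas as pd


def resolve_period_steps(period, freq):
    """Hypothetical sketch of the resolution rule; not Twiga's _resolve_period."""
    if isinstance(period, int):
        steps = period                      # raw integers bypass freq entirely
    else:
        period_td = pd.Timedelta(period)
        freq_td = pd.Timedelta(freq)
        # The period must contain a whole number of sampling intervals.
        if period_td % freq_td != pd.Timedelta(0):
            raise ValueError(
                f"period {period!r} is not a whole multiple of freq {freq!r}"
            )
        steps = period_td // freq_td
    if steps <= 0:
        raise ValueError(f"resolved period must be positive, got {steps}")
    return int(steps)


daily_hourly = resolve_period_steps("1D", "1h")         # 24
weekly_half_hourly = resolve_period_steps("7D", "30min")  # 336
direct = resolve_period_steps(48, "30min")              # 48, freq is ignored
```

A period such as "90min" against freq="1h" fails the whole-multiple check and raises ValueError, matching the behaviour described above.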
SEASONALNAIVEConfig#
| Field | Type | Default | Description |
|---|---|---|---|
| | | | Model identifier. Excluded from parameter dumps. |
| | | | Domain identifier. Excluded from parameter dumps. |
| period | | "1D" | Seasonal period: positive integer or pandas duration string. |
| freq | | "1H" | Data sampling frequency, used to resolve string periods. |
| | | (see below) | Default hyperparameter search space. Excluded from parameter dumps. |
Default Search Space:
| Parameter | Values | Type |
|---|---|---|
| | | categorical |
Fit Validation#
fit() resolves and stores the period in steps, then validates:
seq_len >= period_steps
A ValueError is raised if the lookback window is shorter than the resolved period. Increase lookback_window_size in DataPipelineConfig or reduce the period to fix this.
Predict Logic#
For each forecast step h (0-indexed), the source index into the input window is:
indices = [seq_len - m + (h % m) for h in range(horizon)]
preds = x[:, indices, :num_targets].copy() # (B, horizon, num_targets)
The modular wrap h % m ensures that when horizon > m, the seasonal pattern repeats correctly.
flowchart LR
X["x: (B, L, F)"] --> IDX["indices: [L-m, L-m+1, ..., L-1,\n L-m, L-m+1, ...]"]
IDX --> GATHER["gather x[:, indices, :T]"]
GATHER --> PRED["preds: (B, H, T)"]
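The index arithmetic is easiest to check on a toy window. The following sketch wraps the two lines above in a hypothetical function (seasonal_naive_predict is not a Twiga name) and uses a period shorter than the horizon so that the modular wrap is visible.

```python
import numpy as np


def seasonal_naive_predict(x, period_steps, horizon, num_targets):
    """Gather the last full season from each window and tile it over the horizon."""
    seq_len = x.shape[1]
    if seq_len < period_steps:
        raise ValueError(
            f"lookback {seq_len} is shorter than the period {period_steps}"
        )
    # h % period_steps restarts the season once the horizon exceeds one period.
    indices = [seq_len - period_steps + (h % period_steps) for h in range(horizon)]
    return x[:, indices, :num_targets].copy()      # (B, horizon, num_targets)


x = np.arange(6, dtype=float).reshape(1, 6, 1)     # one window holding 0, 1, ..., 5
preds = seasonal_naive_predict(x, period_steps=3, horizon=5, num_targets=1)
# indices are [3, 4, 5, 3, 4], so preds[0, :, 0] is [3., 4., 5., 3., 4.]
```

With m = 3 and a horizon of 5, the last season (3, 4, 5) is emitted once in full and then restarted for the remaining two steps.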
API Reference#
- class twiga.models.baseline.seasonal_naive_model.SEASONALNAIVEModel(model_config=None)#
Bases: BaseRegressor
Seasonal naïve baseline for multi-horizon forecasting.
Repeats the value observed exactly m data steps before each forecast step, where m is the resolved seasonal period. For forecast horizons longer than one season, the seasonal pattern wraps.
The model has no trainable parameters. fit validates the sequence length and resolves the period; all inference work happens in predict.
- Parameters:
model_config (SEASONALNAIVEConfig | None) – Configuration object. Defaults to SEASONALNAIVEConfig (period="1D", freq="1H").
Example:
# Daily naive on 30-min data (48 steps)
model = SEASONALNAIVEModel(SEASONALNAIVEConfig(period="1D", freq="30min"))
model.fit(X_train, y_train)
preds = model.predict(X_test)  # (B, horizon, num_targets)
# Weekly naive on hourly data
model = SEASONALNAIVEModel(SEASONALNAIVEConfig(period="7D", freq="1H"))
- fit(X, y, eval_set=None, verbose=False)#
Resolve the seasonal period and validate the sequence length.
- Returns:
Self for method chaining.
- Raises:
ValueError – If the resolved period exceeds the sequence length L, or if X or y are not 3-dimensional.
- property min_seq_len: int#
Minimum lookback window length required by the configured period.
Use this to validate your data pipeline’s lookback setting before fitting. Raises ValueError if the period cannot be resolved (e.g. incompatible freq string).
Example:
model = SEASONALNAIVEModel(SEASONALNAIVEConfig(period="7D", freq="1h"))
assert pipeline_config.seq_len >= model.min_seq_len
- predict(x)#
Return seasonal naïve predictions.
For forecast step h (0-indexed), the prediction is taken from position x[:, L - m + (h % m), :num_targets] in the input window, where L is the sequence length and m is the period.
- Parameters:
x (ndarray) – Shape (B, L, F). L must be ≥ fitted period.
- Returns:
Predictions of shape (B, horizon, num_targets).
- Raises:
ValueError – If the model has not been fitted or x is not 3-D.
WINDOWAVERAGEModel#
WINDOWAVERAGEModel predicts the local mean of the last window_size observed target steps in the input context, broadcast to all forecast horizon steps. Because it adapts to the current level of each window, it is a stronger baseline than the global NAIVEModel(strategy="mean").
WINDOWAVERAGEConfig#
| Field | Type | Default | Description |
|---|---|---|---|
| | | | Model identifier. Excluded from parameter dumps. |
| | | | Domain identifier. Excluded from parameter dumps. |
| window_size | | None (full window) | Number of most-recent steps to average. |
| | | (see below) | Default hyperparameter search space. Excluded from parameter dumps. |
Default Search Space:
| Parameter | Values | Type |
|---|---|---|
| window_size | 4, 8, 12, 24, 48, 96, 168 | categorical |
Predict Logic#
w = window_size or seq_len         # None → full window
window = x[:, -w:, :num_targets]   # (B, w, num_targets)
avg = window.mean(axis=1)          # (B, num_targets)
# Repeat the local mean along a new horizon axis
preds = np.repeat(avg[:, np.newaxis, :], horizon, axis=1)  # (B, horizon, num_targets)
All forecast horizon steps receive the same local mean. fit() validates that window_size <= seq_len when specified.
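As a self-contained illustration of the logic above, the sketch below averages the tail of a toy window. The function name window_average_predict and its explicit range check are assumptions made for the example, not Twiga's code.

```python
import numpy as np


def window_average_predict(x, horizon, num_targets, window_size=None):
    """Mean of the last window_size target steps, repeated over the horizon."""
    seq_len = x.shape[1]
    w = seq_len if window_size is None else window_size   # None means full window
    if not 1 <= w <= seq_len:
        raise ValueError(f"window_size must be in [1, {seq_len}], got {w}")
    avg = x[:, -w:, :num_targets].mean(axis=1)             # (B, num_targets)
    return np.repeat(avg[:, np.newaxis, :], horizon, axis=1)


x = np.arange(6, dtype=float).reshape(1, 6, 1)   # one window holding 0, 1, ..., 5
last_two = window_average_predict(x, horizon=3, num_targets=1, window_size=2)
full = window_average_predict(x, horizon=3, num_targets=1)
# last_two is 4.5 everywhere (mean of 4 and 5); full is 2.5 everywhere
```

On this rising window the short average (4.5) sits much closer to the latest level than the full-window average (2.5), which is the trade-off window_size controls.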
Choosing window_size
The default search space contains window sizes of 4, 8, 12, 24, 48, 96, and 168 steps; at hourly resolution these correspond to 4 h, 8 h, 12 h, 24 h (daily), 48 h, 96 h, and 168 h (weekly). Use Optuna HPO with tune() to select the best window_size for your target signal. For trending signals consider DRIFTModel instead, which captures the local slope.
API Reference#
- class twiga.models.baseline.window_average_model.WINDOWAVERAGEModel(model_config=None)#
Bases: BaseRegressor
Window-average baseline for multi-horizon forecasting.
Predicts the local mean of the last window_size observed target values in the input context for all forecast horizon steps. The mean adapts to the current level of each window, unlike a global training mean.
- Parameters:
model_config (WINDOWAVERAGEConfig | None) – Configuration object. Defaults to WINDOWAVERAGEConfig (full-window average).
Example:
# Average over the last 24 steps (1 day at hourly resolution)
model = WINDOWAVERAGEModel(WINDOWAVERAGEConfig(window_size=24))
model.fit(X_train, y_train)
preds = model.predict(X_test)  # (B, horizon, num_targets)
- fit(X, y, eval_set=None, verbose=False)#
Store target shape information.
- Returns:
Self for method chaining.
- Raises:
ValueError – If X or y are not 3-dimensional, or if window_size exceeds the sequence length L.
- property min_seq_len: int#
Minimum lookback window length required by the configured window_size.
Returns window_size when set, or 1 when window_size=None (full-window average adapts to any L).
Example:
model = WINDOWAVERAGEModel(WINDOWAVERAGEConfig(window_size=168))
assert pipeline_config.seq_len >= model.min_seq_len
- predict(x)#
Return window-average predictions.
- Parameters:
x (ndarray) – Shape (B, L, F).
- Returns:
Predictions of shape (B, horizon, num_targets). All horizon steps receive the same local mean value.
- Raises:
ValueError – If the model has not been fitted or x is not 3-D.
DRIFTModel#
DRIFTModel implements the random walk with drift forecast. It computes the per-window linear slope from the first to the last observed value in the input context, then extrapolates that slope forward for all forecast horizon steps.
Formula#
For each sample b and target t:
slope = (x[b, -1, t] − x[b, 0, t]) / max(L − 1, 1)
pred[b, h, t] = x[b, -1, t] + (h + 1) * slope for h = 0, …, H−1
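When L = 1, the first and last observations coincide, so the numerator and therefore the slope are zero; the max(L − 1, 1) guard only keeps the denominator at 1 to avoid division by zero. The model then falls back to pure persistence: ŷ_{t+h} = y_t for all h.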
DRIFTConfig#
| Field | Type | Default | Description |
|---|---|---|---|
| | | | Model identifier. Excluded from parameter dumps. |
| | | | Domain identifier. Excluded from parameter dumps. |
| | | (empty) | Empty search space - drift has no tunable hyperparameters. |
DRIFTModel.update(trial) is a no-op: it calls get_optuna_params to satisfy the interface but makes no config changes.
Predict Logic#
last = x[:, -1, :num_targets] # (B, num_targets)
first = x[:, 0, :num_targets] # (B, num_targets)
slope = (last - first) / max(L - 1, 1) # (B, num_targets)
steps = np.arange(1, horizon + 1) # (horizon,)
# Vectorised outer product
preds = last[:, np.newaxis, :] + steps[np.newaxis, :, np.newaxis] * slope[:, np.newaxis, :]
# shape: (B, horizon, num_targets)
flowchart LR
X["x: (B, L, F)"] --> FIRST["first = x[:,0,:T]"]
X --> LAST["last = x[:,-1,:T]"]
FIRST --> SLOPE["slope = (last - first) / max(L-1,1)"]
LAST --> SLOPE
SLOPE --> PRED["pred[b,h,t] = last + (h+1) * slope"]
LAST --> PRED
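A runnable version of the predict logic, with a worked example, might look like the following. The wrapper function drift_predict is a hypothetical name introduced for illustration.

```python
import numpy as np


def drift_predict(x, horizon, num_targets):
    """Extrapolate the first-to-last slope of each window over the horizon."""
    seq_len = x.shape[1]
    last = x[:, -1, :num_targets]                       # (B, num_targets)
    first = x[:, 0, :num_targets]                       # (B, num_targets)
    slope = (last - first) / max(seq_len - 1, 1)        # guard against L = 1
    steps = np.arange(1, horizon + 1)                   # (horizon,)
    return last[:, np.newaxis, :] + steps[np.newaxis, :, np.newaxis] * slope[:, np.newaxis, :]


x = np.array([10.0, 12.0, 14.0, 16.0]).reshape(1, 4, 1)
preds = drift_predict(x, horizon=3, num_targets=1)
# slope = (16 - 10) / 3 = 2, so preds[0, :, 0] is [18., 20., 22.]

single_step = drift_predict(x[:, -1:, :], horizon=3, num_targets=1)
# L = 1: first == last, slope is 0, so the forecast is [16., 16., 16.]
```

The second call shows the degenerate case: with a single observation the numerator vanishes and the forecast collapses to persistence.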
Drift vs. persistence
DRIFTModel is most valuable when the target signal has a consistent local trend within the lookback window (e.g. a ramp-up at dawn for solar irradiance, or a weekday morning load rise). When the signal is mean-reverting or stationary, the drift extrapolation will overshoot, and NAIVEModel or WINDOWAVERAGEModel will typically be more accurate.
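The contrast can be made concrete with two synthetic series. This is a toy comparison under stated assumptions: both series are invented for illustration, and the one-dimensional helpers persistence, drift, and mae are hypothetical stand-ins rather than Twiga functions.

```python
import numpy as np


def persistence(window, horizon):
    """Repeat the last observed value."""
    return np.full(horizon, window[-1])


def drift(window, horizon):
    """Extrapolate the first-to-last slope of the window."""
    slope = (window[-1] - window[0]) / max(len(window) - 1, 1)
    return window[-1] + slope * np.arange(1, horizon + 1)


def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


horizon = 4
series = {
    "ramp": np.arange(12, dtype=float),     # steady upward trend
    "flat": np.tile([0.0, 1.0], 6),         # oscillates around 0.5, no trend
}

results = {}
for name, values in series.items():
    window, future = values[:-horizon], values[-horizon:]
    results[name] = {
        "persistence": mae(future, persistence(window, horizon)),
        "drift": mae(future, drift(window, horizon)),
    }
```

On the ramp, drift reproduces the future exactly while persistence lags behind; on the trendless series the window happens to start low and end high, so drift extrapolates a spurious slope and ends up with a larger error than persistence.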
API Reference#
- class twiga.models.baseline.drift_model.DRIFTModel(model_config=None)#
Bases: BaseRegressor
Drift (random walk with drift) baseline for multi-horizon forecasting.
Computes the per-window linear slope from the first to the last observed target value in the input context, then extrapolates it forward for all forecast horizon steps.
This is equivalent to StatsForecast’s RandomWalkWithDrift, applied to each sliding window independently.
- Parameters:
model_config (DRIFTConfig | None) – Configuration object. Defaults to DRIFTConfig.
Example:
model = DRIFTModel()
model.fit(X_train, y_train)
preds = model.predict(X_test)  # (B, horizon, num_targets)
- fit(X, y, eval_set=None, verbose=False)#
Store target shape information.
- Returns:
Self for method chaining.
- Raises:
ValueError – If X or y are not 3-dimensional.
- property min_seq_len: int#
Minimum lookback window length for a meaningful drift estimate.
Returns 2: the model requires at least a first and a last observation to compute a slope. L=1 is handled gracefully (slope falls back to zero / pure persistence), but min_seq_len reflects the minimum for non-degenerate behaviour.
- predict(x)#
Return drift predictions.
For each sample b and target t:
last = x[b, -1, t]
first = x[b, 0, t]
drift = (last − first) / max(L − 1, 1)
pred[b, h, t] = last + (h + 1) · drift
- Parameters:
x (ndarray) – Shape (B, L, F).
- Returns:
Predictions of shape (B, horizon, num_targets).
- Raises:
ValueError – If the model has not been fitted or x is not 3-D.
Model Comparison#
| Model | Config Class | Hyperparameters | Adapts to window | Typical use |
|---|---|---|---|---|
| NAIVEModel | NAIVEConfig | strategy | Only "window_last" | Universal persistence benchmark |
| SEASONALNAIVEModel | SEASONALNAIVEConfig | period, freq | Yes - per-window seasonal lookup | Signals with strong daily/weekly periodicity |
| WINDOWAVERAGEModel | WINDOWAVERAGEConfig | window_size | Yes - local mean | Stationary signals with variable level |
| DRIFTModel | DRIFTConfig | (none) | Yes - per-window slope | Trending signals, ramp events |
Usage Example#
The following example shows the complete baseline workflow: train multiple baselines alongside an ML model, evaluate all of them, then compute skill scores.
import pandas as pd
from sklearn.preprocessing import StandardScaler

from twiga.core.config import DataPipelineConfig, ForecasterConfig
from twiga.forecaster.core import TwigaForecaster
from twiga.models.baseline.naive_model import NAIVEConfig
from twiga.models.baseline.seasonal_naive_model import SEASONALNAIVEConfig
from twiga.models.baseline.window_average_model import WINDOWAVERAGEConfig
from twiga.models.baseline.drift_model import DRIFTConfig
from twiga.models.ml.lightgbm_model import LIGHTGBMConfig

# --- Data pipeline ---
data_config = DataPipelineConfig(
    target_feature="load_mw",
    period="1h",
    lookback_window_size=168,  # 7 days at hourly resolution
    forecast_horizon=24,
    calendar_features=["hour", "dayofweek", "month"],
    input_scaler=StandardScaler(),
)

# --- Baseline configs ---
naive_config = NAIVEConfig(strategy="window_last")
seasonal_daily = SEASONALNAIVEConfig(period="1D", freq="1h")
seasonal_weekly = SEASONALNAIVEConfig(period="7D", freq="1h")
window_avg_config = WINDOWAVERAGEConfig(window_size=24)
drift_config = DRIFTConfig()

# --- ML model for comparison ---
lgbm_config = LIGHTGBMConfig(boosting_type="gbdt")

# --- Training orchestration ---
train_config = ForecasterConfig(
    split_freq="months",
    train_size=6,
    test_size=1,
    window="expanding",
    project_name="BaselineBenchmark",
    seed=42,
)

# --- Build forecaster with all models ---
forecaster = TwigaForecaster(
    data_params=data_config,
    model_params=[
        naive_config,
        seasonal_daily,
        seasonal_weekly,
        window_avg_config,
        drift_config,
        lgbm_config,
    ],
    train_params=train_config,
)

data = pd.read_parquet("data/load.parquet")
train_df = data[data.timestamp <= "2024-06-01"]
test_df = data[data.timestamp > "2024-06-01"]

forecaster.fit(train_df=train_df)
results_df, metrics_df = forecaster.evaluate_point_forecast(test_df=test_df)
print(metrics_df)
Tuning Baseline Hyperparameters#
Baselines with search spaces (NAIVEModel, SEASONALNAIVEModel, WINDOWAVERAGEModel) support Optuna-based HPO via tune(). This is useful for selecting the best seasonal period or window size without manual trial-and-error.
# Tune the window size for WINDOWAVERAGEModel
window_forecaster = TwigaForecaster(
    data_params=data_config,
    model_params=[WINDOWAVERAGEConfig()],
    train_params=train_config,
)
window_forecaster.fit(train_df=train_df)
# NOTE: test_df doubles as the validation set here for brevity. Tuning on the
# test split biases the final metrics, so prefer a separate validation split.
window_forecaster.tune(train_df=train_df, val_df=test_df, num_trials=7)

# Retrain with best window_size
window_forecaster.fit(train_df=train_df)
results_df, metrics_df = window_forecaster.evaluate_point_forecast(test_df=test_df)
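WINDOWAVERAGEConfig’s default search space has 7 candidates, so an exhaustive sweep needs only 7 evaluations. Note that 7 trials cover every candidate only if the sampler visits each value exactly once, as a grid-style sampler does; a stochastic sampler may repeat values.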
Skill Scores#
A skill score measures the relative improvement of a learned model over a naive reference. The standard formulation is:
SS = 1 - (error_model / error_baseline)
Here error is any non-negative error metric for which lower is better (MAE, RMSE, MAPE, etc.), and error_baseline must be non-zero. A skill score of 0 means the model performs identically to the baseline; 1.0 is a perfect forecast; negative values indicate the model is worse than the baseline.
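The formula and its three reference points can be checked with a few lines of plain Python. The function name skill_score and the explicit guard on the baseline error are choices made for this sketch.

```python
def skill_score(error_model, error_baseline):
    """SS = 1 - error_model / error_baseline for a lower-is-better error metric."""
    if error_baseline <= 0:
        raise ValueError("baseline error must be positive")
    return 1.0 - error_model / error_baseline


perfect = skill_score(0.0, 10.0)      # model error is zero
better = skill_score(5.0, 10.0)       # model halves the baseline error
same = skill_score(10.0, 10.0)        # model matches the baseline
worse = skill_score(15.0, 10.0)       # model is worse than the baseline
```

The four calls evaluate to 1.0, 0.5, 0.0, and -0.5 respectively, covering the perfect, improved, equal, and degraded cases described above.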
Computing Skill Scores from Twiga Metrics#
import pandas as pd
# Evaluate all models
results_df, metrics_df = forecaster.evaluate_point_forecast(test_df=test_df)
# metrics_df has index=model_name, columns include "mae", "rmse", etc.
baseline_mae = metrics_df.loc["naive", "mae"]
# Skill score relative to naive persistence
metrics_df["skill_score_mae"] = 1 - (metrics_df["mae"] / baseline_mae)
print(metrics_df[["mae", "skill_score_mae"]].sort_values("skill_score_mae", ascending=False))
Choosing the Reference Baseline#
| Signal type | Recommended reference |
|---|---|
| Load (AC or heating) | |
| Solar PV / irradiance | |
| Wind power | |
| Price (day-ahead) | |
| Differenced / centred series | NAIVEModel with strategy "zero" |
Reporting skill scores
When publishing or reporting forecast results, always state which baseline was used as the reference and over which evaluation period the skill score was computed. A skill score computed against a weak baseline (e.g. "zero") can be misleadingly high; prefer the strongest relevant naive model as the denominator.
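A small numeric illustration shows how much the choice of denominator matters. All values below are invented for the example, and the persistence reference is a hypothetical set of previous observations rather than output from a Twiga model.

```python
import numpy as np

# Toy observations and forecasts (illustrative numbers only).
y_true = np.array([100.0, 110.0, 105.0, 95.0])
model_pred = np.array([98.0, 107.0, 109.0, 96.0])
zero_ref = np.zeros_like(y_true)                          # weak reference
persistence_ref = np.array([102.0, 100.0, 110.0, 105.0])  # hypothetical previous values


def mae(a, b):
    return float(np.mean(np.abs(a - b)))


def skill(model_err, ref_err):
    return 1.0 - model_err / ref_err


model_mae = mae(y_true, model_pred)                               # 2.5
skill_vs_zero = skill(model_mae, mae(y_true, zero_ref))
skill_vs_persistence = skill(model_mae, mae(y_true, persistence_ref))
```

The same forecast scores roughly 0.98 against the zero reference but only about 0.63 against persistence, which is why the reference must always be reported alongside the score.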
See Also#
Machine Learning Models - gradient-boosted and linear models sharing the same interface
TwigaForecaster - training, prediction, and evaluation orchestration
Model Catalog Overview - complete list of all domains and models
Hyperparameter Tuning - Optuna-based search space configuration and tuning workflows