API Reference#

Complete reference for all public classes, functions, and exceptions exported by twiga.

Note

Symbols marked stable follow semantic versioning. Symbols marked experimental may change in minor versions.


Entry Point#

class twiga.forecaster.core.TwigaForecaster(data_params, model_params, train_params, conformal_params=None)

Bases: BaseForecaster

Machine Learning Forecaster for time series predictions.

This forecaster initializes a data pipeline and dynamically loads machine learning models based on provided configurations. The configurations can be specified as Pydantic models or dictionaries. Once the models are loaded, they can be trained, evaluated, and backtested.

Example

>>> from twiga.core.config import BaseModelConfig, DataPipelineConfig, ForecasterConfig
>>> data_params = DataPipelineConfig(date_column="date", ...)
>>> model_config = BaseModelConfig(name="linear", ...)
>>> train_params = ForecasterConfig(...)
>>> forecaster = TwigaForecaster(data_params, model_config, train_params)
>>> forecaster.fit(train_df)
>>> predictions, metrics = forecaster.evaluate_point_forecast(test_df)
__init__(data_params, model_params, train_params, conformal_params=None)

Initialize TwigaForecaster.

Parameters:
  • data_params (DataPipelineConfig) – Configuration for the data pipeline.

  • model_params (BaseModelConfig | list[BaseModelConfig] | dict | list[dict]) – Configuration for the model(s). Can be a single Pydantic config, a dictionary, or a list of either. Neural network configs with unset dims (num_target_feature, forecast_horizon, lookback_window_size equal to 0) are auto-populated from data_params. Base arch configs with a distribution field set are automatically resolved to the corresponding probabilistic variant (e.g. MLPFConfig(distribution='normal') becomes an MLPFNormalConfig).

  • train_params (ForecasterConfig) – Training configuration parameters.

  • conformal_params (ConformalConfig | None) – Optional conformal prediction configuration.

calibrate(calibrate_df=None, covariate_df=None, ensemble_strategy=None, ensemble_weights=None)

Calibrate conformal prediction models using calibration data.

Parameters:
  • calibrate_df (DataFrame | None) – Calibration dataset. If None, uses the stored training data.

  • covariate_df (DataFrame | None) – Optional covariate dataset.

  • ensemble_strategy (str | None) – Strategy for combining model predictions.

  • ensemble_weights (dict[str, float] | None) – Weights for weighted ensemble strategy.

Raises:

ValueError – If conformal_params is not set.

Return type:

None

explain(X, model_idx=0, n_background=100)

Compute SHAP feature attributions for a fitted ML model.

Builds a ShapExplainer for the model at position model_idx in self.models, runs SHAP over X, and returns a ShapResult with values reshaped to (B, L, F) - one attribution per sample, per lookback step, per feature.

Only ML models (domain="ml") are supported. Neural-network models require gradient-based attribution and are not currently handled.

Parameters:
  • X (ndarray) – Feature array of shape (B, L, F) as produced by the data pipeline (e.g. from DataPipeline.transform()).

  • model_idx (int) – Index into self.models of the model to explain. Defaults to 0 (the first / only model).

  • n_background (int) – Number of background samples for LinearExplainer and KernelExplainer. Ignored for tree models.

Return type:

ShapResult

Returns:

ShapResult with:

  • values - SHAP array (B, L, F)

  • feature_names - original F feature names

  • timestep_labels - L lookback labels ('t-L+1' … 't0')

  • expected_value - SHAP base value (mean prediction)

Example

>>> result = forecaster.explain(X_test)
>>> result.plot_importance(top_n=20)
>>> importance = result.mean_importance()

Configuration#

class twiga.core.config.DataPipelineConfig(**data)

Bases: BaseModel

Configuration for a time-series data pipeline.

Captures everything the pipeline needs to know about the raw dataset: which column to forecast, which features are available, how long the lookback and forecast windows are, what scalers to apply, and which lag/rolling-window features to engineer.

Parameters:
  • target_feature (list[str] | str) – Target variable name(s) to forecast.

  • period (str) – Sampling frequency using pandas offset aliases (e.g. "1H", "30min").

  • lookback_window_size (int) – Number of past timesteps fed to the model as input.

  • forecast_horizon (int) – Number of future timesteps to predict.

  • latitude (float | None, optional) – Latitude for day/night feature calculation. Defaults to None.

  • longitude (float | None, optional) – Longitude for day/night feature calculation. Defaults to None.

  • historical_features (list[str] | None, optional) – Features whose future values are unknown (historical context only). Defaults to None.

  • calendar_features (list[str] | None, optional) – Cyclical temporal features derived from the timestamp column. Defaults to None.

  • exogenous_features (list[str] | None, optional) – Features known over the full lookback + forecast horizon. Defaults to None.

  • future_covariates (list[str] | None, optional) – Features known only over the forecast horizon. Defaults to None.

  • input_scaler (object | None, optional) – Scaler applied to input features. Defaults to None.

  • target_scaler (object | None, optional) – Scaler applied to the target variable. Defaults to None.

  • lags (list[int] | None, optional) – Lag intervals in periods for feature engineering. Defaults to None.

  • windows (list[int] | int | None, optional) – Window sizes for rolling statistics. Defaults to None.

  • window_funcs (list[str] | str | None, optional) – Aggregation functions applied to rolling windows (e.g. "mean", "std"). Defaults to None.

  • date_column (str, optional) – Name of the datetime column. Defaults to "timestamp".

calendar_features: list[str] | None
date_column: str
exogenous_features: list[str] | None
forecast_horizon: int
future_covariates: list[str] | None
historical_features: list[str] | None
input_scaler: object | None
lags: list[int] | None
latitude: float | None
longitude: float | None
lookback_window_size: int
model_config: ClassVar[ConfigDict] = {}

Pydantic model configuration; a dictionary conforming to pydantic.ConfigDict.

period: str
stride: int
target_feature: list[str] | str
target_scaler: object | None
window_funcs: list[str] | str | None
windows: list[int] | int | None
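The windowing fields interact as follows: each training sample pairs lookback_window_size past steps with forecast_horizon future steps. A minimal sketch of that slicing (illustrative only; the real DataPipeline also applies scaling and feature engineering, and make_windows is not a twiga function):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(series: np.ndarray, lookback: int, horizon: int, stride: int = 1):
    """Slice a 1-D series into (lookback, horizon) input/target pairs."""
    total = lookback + horizon
    windows = sliding_window_view(series, total)[::stride]
    return windows[:, :lookback], windows[:, lookback:]

series = np.arange(10.0)                      # 10 timesteps
X, y = make_windows(series, lookback=4, horizon=2)
# X has shape (5, 4): five samples of 4 past steps each
# y has shape (5, 2): the 2 future steps following each input window
```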
class twiga.core.config.ForecasterConfig(**data)

Bases: BaseModel

Configuration for the forecaster cross-validation runner.

Controls how the time-series is split for evaluation (split frequency, window type, train/test sizes), and holds project-level metadata such as the project name and output file name.

Parameters:
  • domain (Literal["ml"], optional) – Modelling domain identifier. Fixed to "ml"; excluded from parameter tuning. Defaults to "ml".

  • split_freq (str, optional) – Unit for train_size, test_size, and gap. One of "days", "minutes", "hours", "weeks", "months", "years". Defaults to "months".

  • test_size (int, optional) – Number of split_freq units in each test fold. Defaults to 1.

  • train_size (int, optional) – Number of split_freq units in each training fold (rolling window only). Defaults to 1.

  • gap (int, optional) – Number of split_freq units between the end of the training fold and the start of the test fold. Defaults to 0.

  • stride (int | None, optional) – Step size between consecutive splits in split_freq units. None uses test_size as the stride. Defaults to None.

  • window (Literal["expanding", "rolling"], optional) – Cross-validation window strategy. Defaults to "expanding".

  • num_splits (int | None, optional) – Maximum number of CV splits. None uses all available splits. Defaults to None.

  • project_name (str, optional) – Experiment / project name used for logging and output paths. Defaults to "experiment".

  • file_name (str | None, optional) – Output file name. None auto-generates from the project name. Defaults to None.

  • seed (int, optional) – Random seed for reproducibility. Defaults to 42.

  • date_column (str, optional) – Name of the datetime column in the dataset. Defaults to "timestamp".

  • root_dir (str, optional) – Root directory for output artefacts. Defaults to "../".

  • metrics (tuple[str] | list[str] | None, optional) – Evaluation metrics to compute and log. None uses the runner’s defaults. Defaults to None.

checkpoints_path: str | None
date_column: str
domain: Literal['ml']
file_name: str | None
gap: int
metrics: tuple[str, ...] | list[str] | None
model_config: ClassVar[ConfigDict] = {}

Pydantic model configuration; a dictionary conforming to pydantic.ConfigDict.

num_splits: int | None
project_name: str
root_dir: str
seed: int
split_freq: Literal['days', 'minutes', 'hours', 'weeks', 'months', 'years']
stride: int | None
test_size: int
train_size: int
window: Literal['expanding', 'rolling']
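The splitting semantics can be sketched in a few lines. This is a simplified model working on integer unit indices; the actual runner groups timestamps by split_freq, and time_splits is not part of twiga:

```python
def time_splits(n_units, train_size, test_size, gap=0, stride=None, window="expanding"):
    """Yield (train, test) index ranges over n_units time units (sketch)."""
    stride = stride or test_size               # None -> step by test_size
    start = train_size                         # earliest fold boundary
    while start + gap + test_size <= n_units:
        test = range(start + gap, start + gap + test_size)
        if window == "rolling":
            train = range(start - train_size, start)
        else:                                  # "expanding": all history so far
            train = range(0, start)
        yield train, test
        start += stride

splits = list(time_splits(6, train_size=2, test_size=1))
# 4 splits; the expanding train fold grows while the test fold slides forward
```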
class twiga.core.config.BaseModelConfig(**data)

Bases: BaseModel

Shared base configuration for all forecasting models.

Provides the name, domain, and search_space fields that every concrete config is expected to expose, along with a uniform get_optuna_params() that merges fixed config values with any search-space suggestions.

Subclass this to define model-specific configurations:

class MyModelConfig(BaseModelConfig):
    name: Literal["my_model"] = Field(default="my_model", exclude=True)
    hidden_size: int = 128
    dropout: float = 0.3
    search_space: BaseSearchSpace = BaseSearchSpace(
        hidden_size=[64, 128, 256],
        dropout=(0.0, 0.5),
    )
Parameters:
  • name (Literal["base_model"], optional) – Model type identifier. Excluded from parameter tuning. Defaults to "base_model".

  • domain (Literal["nn"], optional) – Modelling domain identifier. Excluded from parameter tuning. Defaults to "nn".

  • search_space (BaseSearchSpace | None, optional) – Hyperparameter search space. When set, its fields are merged into the output of get_optuna_params() for HPO. Defaults to None.

domain: Literal['nn']
get_optuna_params(trial)

Return fixed config values merged with Optuna search-space suggestions.

Fixed parameters come from pydantic.BaseModel.model_dump() (with name and search_space excluded). If a search_space is set, its fields are sampled for trial and override any overlapping fixed values, allowing a single config object to serve both fixed and tuned usage patterns.

Parameters:

trial (Trial) – Active Optuna trial.

Return type:

dict[str, Any]

Returns:

dict[str, Any] – Combined parameter dict ready to pass to the model constructor.
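The merge rule can be illustrated with plain dicts. The field names here are hypothetical: fixed stands in for the model_dump() output and sampled for the search-space suggestions:

```python
# Hypothetical fields: `fixed` mimics model_dump() with name/search_space
# excluded; `sampled` mimics the values suggested for the active trial.
fixed = {"hidden_size": 128, "dropout": 0.3, "lr": 1e-3}
sampled = {"dropout": 0.12, "lr": 3e-3}

# Sampled suggestions override overlapping fixed values.
params = {**fixed, **sampled}
# params == {"hidden_size": 128, "dropout": 0.12, "lr": 0.003}
```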

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}

Pydantic model configuration; a dictionary conforming to pydantic.ConfigDict.

name: Literal['base_model']
search_space: BaseSearchSpace | None
class twiga.core.config.ConformalConfig(**data)

Bases: BaseModel

Configuration for conformal prediction methods.

Supports three conformal predictors - residual-based, quantile-based, and residual-fitting - each with compatible nonconformity score types.

Parameters:
  • method (Literal["residual", "quantile", "residual-fitting"], optional) –

    Conformal prediction method:

    • "residual" - nonconformity scores based on absolute residuals |y - ŷ|.

    • "quantile" - quantile regression for prediction intervals.

    • "residual-fitting" - fits a secondary model to predict residuals for adaptive interval widths.

    Defaults to "residual".

  • score_type (str, optional) – Nonconformity score type. "scaled" / "unscaled" for quantile method; "res" / "sign-res" for residual-based methods. Defaults to "res".

  • alpha (float, optional) – Significance level controlling the confidence level (1 - alpha) of the prediction intervals. Must be in (0, 1). For example alpha=0.1 → 90 % coverage. Defaults to 0.1.

Raises:

ValueError – If method="quantile" is combined with a residual score type, or if a residual method is combined with a quantile score type.

Examples

>>> ConformalConfig(method="residual", score_type="res", alpha=0.1)
ConformalConfig(method='residual', score_type='res', alpha=0.1)
>>> ConformalConfig(method="quantile", score_type="scaled", alpha=0.05)
ConformalConfig(method='quantile', score_type='scaled', alpha=0.05)
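The default "residual" method corresponds to standard split conformal prediction. A self-contained numpy sketch of the idea on synthetic calibration data (not twiga code):

```python
import numpy as np

# Synthetic calibration set: truth and stand-in point predictions.
rng = np.random.default_rng(0)
y_cal = rng.normal(size=500)
y_hat_cal = np.zeros(500)

alpha = 0.1
scores = np.abs(y_cal - y_hat_cal)            # "res" nonconformity scores
n = len(scores)
# Finite-sample-corrected quantile of the calibration scores.
q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))

# Symmetric (1 - alpha) interval around any new point forecast y_hat.
y_hat = 0.0
lower, upper = y_hat - q, y_hat + q
```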
alpha: Annotated[float, FieldInfo(annotation=NoneType, required=False, default=0.1, description='Significance level for prediction intervals. Controls coverage as (1 - alpha). Example: alpha=0.1 → 90% prediction intervals.', metadata=[Gt(gt=0.0), Lt(lt=1.0)])]
method: Annotated[Literal['residual', 'quantile', 'residual-fitting'], FieldInfo(annotation=NoneType, required=False, default='residual', description="Conformal prediction method. 'residual': absolute residual scores. 'quantile': quantile regression intervals. 'residual-fitting': secondary model predicts residuals for adaptive widths.")]
model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}

Pydantic model configuration; a dictionary conforming to pydantic.ConfigDict.

score_type: Annotated[Literal['scaled', 'unscaled', 'res', 'sign-res'], FieldInfo(annotation=NoneType, required=False, default='res', description="Nonconformity score type. 'scaled'/'unscaled': for quantile method. 'res'/'sign-res': for residual-based methods.")]
validate_method_score_compatibility()

Validate that method and score_type are compatible.

Return type:

ConformalConfig

classmethod warn_extreme_alpha(v)

Warn if alpha is likely to produce degenerate intervals.

Return type:

float

class twiga.core.config.NeuralModelConfig(**data)

Bases: BaseModelConfig

Configuration for neural network-based forecasting models.

Extends BaseModelConfig with training infrastructure fields and a shared three-dict HPO system for optimizer, scheduler, and batch-size search. See the module docstring for a full explanation of the search space design.

The optimizer and scheduler are selected via optimizer_type and lr_scheduler_type. Both are captured by save_hyperparameters() in BaseNeuralModel at training time, so they must be declared as fields here.

Optional fine-grained overrides can be supplied via optimizer_params and scheduler_params. When provided they are merged into the corresponding entry of BaseNeuralModel.OPTIMIZERS / BaseNeuralModel.SCHEDULERS, allowing partial overrides (e.g. only lr) without replacing the full dict.

Parameters:
  • name (Literal["neural_model"], optional) – Model type identifier. Defaults to "neural_model".

  • domain (Literal["nn"], optional) – Modelling domain identifier. Defaults to "nn".

  • rich_progress_bar (bool, optional) – Enable rich progress bars. Defaults to True.

  • wandb_logging (bool, optional) – Enable Weights & Biases logging. Defaults to False.

  • drop_last (bool, optional) – Drop the last incomplete batch. Defaults to True.

  • num_workers (int, optional) – DataLoader worker count. Defaults to 8.

  • batch_size (int, optional) – Training batch size. Defaults to 64.

  • pin_memory (bool, optional) – Pin memory for faster GPU transfer. Defaults to True.

  • max_epochs (int, optional) – Maximum training epochs. Defaults to 10.

  • patience (int, optional) – Early-stopping patience in epochs. Defaults to 10.

  • resume_training (bool, optional) – Resume from last checkpoint. Defaults to True.

  • seed (int, optional) – Positive integer random seed. Defaults to 42.

  • metric (Literal["mae", "mse", "smape"], optional) – Validation metric. Defaults to "mae".

  • optimizer_type (Literal[...], optional) – Native torch.optim optimizer. Defaults to "adamw".

  • lr_scheduler_type (Literal[...], optional) – Native torch.optim.lr_scheduler class. Defaults to "multi_step".

  • optimizer_params (dict | None, optional) – Partial override for the selected optimizer’s default params. Defaults to None.

  • scheduler_params (dict | None, optional) – Partial override for the selected scheduler’s default params. Defaults to None.

BASE_TRAINING_SEARCH_SPACE: ClassVar[BaseSearchSpace] = BaseSearchSpace(optimizer_type=['adam', 'adamw'], lr_scheduler_type=['warmup_cosine', 'multi_step', 'reduce_on_plateau'], batch_size=[8, 16, 32, 64])
OPTIMIZER_PARAM_SEARCH: ClassVar[dict[str, BaseSearchSpace]] = {'adam': BaseSearchSpace(lr=(0.0001, 0.01), weight_decay=(1e-07, 0.0001)), 'adamw': BaseSearchSpace(lr=(0.0001, 0.01), weight_decay=(1e-06, 0.001)), 'muon': BaseSearchSpace(lr=(0.001, 0.1), momentum=(0.9, 0.99), ns_steps=[4, 6, 8])}
SCHEDULER_PARAM_SEARCH: ClassVar[dict[str, BaseSearchSpace]] = {'multi_step': BaseSearchSpace(prob_decay_1=(0.3, 0.6), prob_decay_2=(0.7, 0.95), gamma=[0.1, 0.2, 0.5]), 'reduce_on_plateau': BaseSearchSpace(factor=[0.1, 0.2, 0.5], prob_patience=(0.05, 0.2)), 'warmup_cosine': BaseSearchSpace(warmup_epochs=[3, 5, 10], eta_min=(1e-07, 1e-05))}
batch_size: int
domain: Literal['nn']
drop_last: bool
classmethod from_data_config(data_config, **kwargs)

Create a config instance with dimensions derived from a DataPipelineConfig.

Parameters:
  • data_config (DataPipelineConfig) – Pipeline config providing feature counts and sequence dimensions.

  • **kwargs – Additional fields forwarded to the constructor, allowing any field to be overridden at instantiation time.

Returns:

NeuralModelConfig – Populated config instance.

Raises:
  • TypeError – If data_config.target_feature is not str or list[str].

  • AttributeError – If data_config is missing forecast_horizon.

get_optuna_params(trial)

Standard HPO sampling for all neural models.

Combines child-specific architecture parameters with the standardized conditional optimizer and scheduler search space.

Return type:

dict

gradient_clip_val: float | None
lr_scheduler_type: Literal['step', 'multi_step', 'multiplicative', 'exponential', 'constant', 'linear_decay', 'polynomial', 'cosine_annealing', 'cosine_annealing_lr', 'cyclic', 'reduce_on_plateau', 'one_cycle', 'warmup_multi_step', 'warmup_cosine']
max_epochs: int
metric: Literal['mae', 'mse', 'smape']
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}

Pydantic model configuration; a dictionary conforming to pydantic.ConfigDict.

monitor: Literal['loss', 'mae', 'mse', 'smape', 'sigma_loss'] | None
name: Literal['neural_model']
num_workers: int
optimizer_params: dict | None
optimizer_type: Literal['adam', 'adamw', 'nadam', 'radam', 'adamax', 'adafactor', 'adagrad', 'adadelta', 'rmsprop', 'rprop', 'asgd', 'sgd', 'muon']
patience: int
pin_memory: bool
resume_training: bool
rich_progress_bar: bool
classmethod sample_training_params(trial)

Sample optimizer, scheduler, and batch-size using BaseSearchSpace logic.

Return type:

dict

scheduler_params: dict | None
seed: int
wandb_logging: bool
class twiga.core.config.BaseSearchSpace(**data)

Bases: BaseModel

Pydantic model for validating hyperparameter optimisation search spaces.

Each field must be either:

  • A tuple[float, float] or tuple[int, int] representing a continuous range (low, high). Float ranges spanning more than one order of magnitude (high / low >= 10) are sampled on a log scale automatically.

  • A list of at least one categorical value.
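The field classification above, including the automatic log-scale rule, can be sketched as follows (illustrative helper, not part of twiga):

```python
def field_kind(value):
    """Classify one search-space field per the rules above (sketch)."""
    if isinstance(value, tuple) and len(value) == 2:
        low, high = value
        if isinstance(low, float) and low > 0 and high / low >= 10:
            return "log_range"      # float range spanning >= one decade
        return "range"
    if isinstance(value, list) and value:
        return "categorical"
    raise ValueError(f"invalid search-space field: {value!r}")
```

For example, (1e-4, 1e-2) spans two orders of magnitude and is sampled on a log scale, while (0.0, 0.5) is sampled linearly.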

The class uses extra="allow" so that concrete search spaces can be defined inline without subclassing:

space = BaseSearchSpace(
    latent_size=[64, 128, 256],
    dropout=(0.0, 0.5),
)
Parameters:

**kwargs – Any keyword argument whose value is a valid range tuple or categorical list.

Examples

>>> space = BaseSearchSpace(lr=(1e-4, 1e-2), activation=["relu", "tanh"])
>>> params = space.get_optuna_params(trial, prefix="mlp")
get_optuna_params(trial, prefix='')

Generate Optuna parameter suggestions for all fields.

Parameters:
  • trial (Trial) – Active Optuna trial.

  • prefix (str) – Prefix prepended to each parameter name in the trial (e.g. the model name) to avoid collisions when multiple search spaces are sampled in the same trial. Defaults to "".

Return type:

dict[str, Any]

Returns:

dict[str, Any] – Mapping of field names (without prefix) to their sampled values.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}

Pydantic model configuration; a dictionary conforming to pydantic.ConfigDict.

validate_against(config)

Raise ValueError if any search space field name is not present on config.

Catches typos in search space definitions early - before an Optuna trial is run - so that mis-spelled field names produce a clear error instead of silently sampling a parameter that never gets applied.

Parameters:

config (BaseModel) – The model config instance (or class) whose fields define the valid parameter names.

Raises:

ValueError – If one or more field names in this search space do not exist on config.

Return type:

None

Examples

>>> space = BaseSearchSpace(hiddn_dim=[64, 128])  # typo!
>>> space.validate_against(my_model_config)
Traceback (most recent call last):
    ...
ValueError: Search space contains unknown fields: {'hiddn_dim'}. ...
validate_search_space()

Validate all fields have valid types and structure.

Return type:

BaseSearchSpace


Registry#

twiga.forecaster.registry.get_model(name, domain=None)

Lazily load the model and config classes from models/ml/ or models/nn/.

Parameters:
  • name (str) – The name of the model (e.g., “linear”, “lstm”).

  • domain (str | None) – The specific domain to look in (“ml” or “nn”). If None, searches both.

Return type:

tuple[type, type]

Returns:

tuple[type, type] – A tuple of (model_class, config_class).

Raises:

ValueError – If the model is not found in the specified or default domains.
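The lookup behaviour can be sketched with a plain dict standing in for the lazy module imports (class names here are hypothetical):

```python
# Plain-dict stand-in for the lazy registry (class names are hypothetical;
# the real get_model imports classes from models/ml/ or models/nn/ on demand).
REGISTRY = {
    ("ml", "linear"): ("LinearModel", "LinearModelConfig"),
    ("nn", "lstm"): ("LSTMModel", "LSTMConfig"),
}

def get_model(name, domain=None):
    domains = [domain] if domain else ["ml", "nn"]   # None searches both
    for d in domains:
        if (d, name) in REGISTRY:
            return REGISTRY[(d, name)]               # (model_class, config_class)
    raise ValueError(f"Model {name!r} not found in domains {domains}")
```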


Evaluation#

twiga.core.metrics.point.evaluate_point_forecast(result, metric_names=None, axis=1)

Evaluate point forecasts by computing daily pointwise metrics.

Parameters:
  • result (ForecastResult) – ForecastResult with ground_truth set, kind=ForecastKind.POINT.

  • metric_names (list[str] | None) – Metric names to compute. When None all supported point metrics are computed.

  • axis (int | None) – Axis along which to compute aggregate metrics. If None, metrics that require an axis will use their default behavior.

Return type:

DataFrame

Returns:

DataFrame of per-day, per-target metrics indexed by daily timestamp.
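The daily aggregation can be illustrated with pandas resampling (synthetic data; the actual function computes the full metric suite from a ForecastResult):

```python
import numpy as np
import pandas as pd

# Synthetic hourly series over two days with a constant forecast bias.
idx = pd.date_range("2024-01-01", periods=48, freq="h")
truth = pd.Series(np.sin(np.arange(48) / 4.0), index=idx)
pred = truth + 0.1

# Daily pointwise MAE: aggregate absolute errors per calendar day,
# yielding one metric row per daily timestamp.
daily_mae = (pred - truth).abs().resample("D").mean()
```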

twiga.core.metrics.interval.evaluate_interval_forecast(result, alpha=0.01, true_nmpi=None, spread='iqr', nmpi_scale='range', axis=1, metric_names=None)

Evaluate interval forecasts by computing daily point and interval metrics.

Parameters:
  • result (ForecastResult) – ForecastResult with ground_truth, lower, and upper set, kind=ForecastKind.INTERVAL.

  • alpha (float) – Significance level used for Winkler score and coverage computations. Must be in (0, 1). Defaults to 0.01.

  • true_nmpi (float | None) – Override for κ — absolute spread of the target used as the CWE reference numerator. When None, derived from spread.

  • spread (Literal['iqr', 'mad', 'std']) – Spread measure for the CWE reference κ. "iqr" (default), "mad", or "std". See get_interval_metrics().

  • nmpi_scale (Literal['range', 'max', 'mean', 'median']) – Denominator R for NMPI and κ/R. "range" (default), "max", "mean", or "median".

  • axis (int | None) – Axis along which to compute aggregate metrics.

  • metric_names (list[str] | None) – List of interval metric names to compute.

Return type:

DataFrame

Returns:

DataFrame of per-day, per-target point and interval metrics indexed by daily timestamp.
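Two of the core interval quantities, empirical coverage and mean interval width, can be sketched directly in numpy (synthetic arrays shaped like a ForecastResult of kind INTERVAL):

```python
import numpy as np

# Synthetic arrays shaped (n_batch, horizon, n_targets).
rng = np.random.default_rng(1)
truth = rng.normal(size=(8, 24, 1))
point = np.zeros_like(truth)
lower, upper = point - 1.5, point + 1.5       # toy fixed-width bounds

coverage = np.mean((truth >= lower) & (truth <= upper))   # fraction inside
mean_width = np.mean(upper - lower)                        # 3.0 here
```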


Forecast Results (experimental)#

class twiga.forecaster.result.ForecastResult(timestamps, loc, targets, model_name, kind, ground_truth=None, scale=None, quantiles=None, quantile_levels=None, conf_level=None, samples=None, lower=None, upper=None, inference_time=0.0)

Bases: object

Container for one model’s forecast output.

Variables:
  • timestamps – shape (n_batch, n_horizon, n_targets)

  • loc – point predictions (mean/median), shape (n_batch, n_horizon, n_targets)

  • targets – ordered list of target variable names

  • model_name – human-readable model identifier

  • kind – determines which optional arrays are expected and how to convert

  • ground_truth – optional, same shape as loc

  • scale – parametric std-dev / scale, same shape as loc

  • quantiles – shape (n_batch, n_q, n_horizon, n_targets)

  • quantile_levels – corresponding probability levels (e.g. [0.1, 0.5, 0.9])

  • samples – shape (n_batch, n_samples, n_horizon, n_targets)

  • lower – lower bound, same shape as loc

  • upper – upper bound, same shape as loc

  • inference_time – inference duration in seconds

  • conf_level

  • metric_name

conf_level: list[float] | ndarray | None = None
evaluate(ground_truth=None, **kwargs)

Evaluate forecast against ground truth using kind-appropriate metrics.

Forwards to twiga.core.metrics.evaluate_forecast().

Parameters:
  • ground_truth (ndarray | None) – shape (n_batch, n_horizon, n_targets). When omitted the ground_truth stored on the result is used.

  • **kwargs – forwarded to the underlying evaluate function.

Return type:

DataFrame

Returns:

DataFrame of per-day, per-target metrics.

Raises:

ValueError – if no ground truth is available.

ground_truth: ndarray | None = None
inference_time: float = 0.0
kind: ForecastKind
loc: ndarray
lower: ndarray | None = None
model_name: str
quantile_levels: list[float] | ndarray | None = None
quantiles: ndarray | None = None
samples: ndarray | None = None
scale: ndarray | None = None
targets: list[str]
timestamps: ndarray
to_dataframe(fmt='long')

Convert forecast to tidy DataFrame.

Always includes: timestamp, target, model, forecast. Optional: actual (when ground_truth is present).

Additional columns depend on forecast kind:

  • POINT: no extra columns

  • PARAMETRIC: scale

  • INTERVAL: lower, upper

  • QUANTILE (fmt=”wide”): q_0.10, q_0.50, …

  • QUANTILE (fmt=”long”): q_level, quantile_forecast

  • SAMPLES: q_0.10, q_0.50, q_0.90 (empirical quantiles)

Parameters:

fmt (str) – “long” (default) or “wide” - only affects QUANTILE

Return type:

DataFrame

Returns:

pandas DataFrame in long or wide format

Raises:

ValueError – if fmt is invalid

upper: ndarray | None = None
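The long layout produced by to_dataframe() can be illustrated by building the equivalent tidy frame by hand for a POINT forecast (toy data; the target and model names are hypothetical):

```python
import numpy as np
import pandas as pd

# Toy POINT forecast: 2 timestamps, 1 target (names are hypothetical).
timestamps = pd.date_range("2024-01-01", periods=2, freq="D")
forecast = np.array([10.5, 11.0])
actual = np.array([10.0, 11.2])     # present only when ground_truth is set

# One row per timestamp/target pair, mirroring the long format.
df = pd.DataFrame({
    "timestamp": timestamps,
    "target": "load",
    "model": "linear",
    "forecast": forecast,
    "actual": actual,
})
```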
class twiga.forecaster.result.ForecastCollection(results=<factory>)

Bases: object

Collection of ForecastResult objects from multiple models.

add(result)

Add or replace result using its model_name as key.

Return type:

None

evaluate(**kwargs)

Evaluate all models and return a combined metrics DataFrame.

Calls ForecastResult.evaluate() on each result and concatenates the output, adding a "Model" column derived from each result’s model_name. Ground truth must be attached to each result (i.e. forecast() must have been called with test data that contains the target column).

Parameters:

**kwargs – Forwarded to each ForecastResult.evaluate() call (e.g. metric_names, freq).

Return type:

DataFrame

Returns:

Combined metrics DataFrame with a "Model" column.

Raises:

ValueError – If the collection is empty or any result lacks ground truth.

property model_names: list[str]
results: dict[str, ForecastResult]
to_dataframe(fmt='long')

Concatenate all model forecasts into one DataFrame.

Parameters:

fmt (str) – passed to each ForecastResult.to_dataframe()

Return type:

DataFrame

Returns:

Combined long-format DataFrame

Raises:

ValueError – if collection is empty

class twiga.forecaster.result.ForecastKind(*values)

Bases: StrEnum

Supported forecast output types.

Values are strings and can be used directly as dict keys.

INTERVAL = 'interval'
PARAMETRIC = 'parametric'
POINT = 'point'
QUANTILE = 'quantile'
SAMPLES = 'samples'

Ensemble (experimental)#

twiga.forecaster.ensemble.compute_ensemble_predictions(predictions, model_names, ensemble_strategy, ensemble_weights=None)

Generate ensemble predictions by combining predictions from multiple models.

Parameters:
  • predictions (list[ndarray]) – List of model predictions, where each prediction is a 3D NumPy array with shape (num_samples, horizon, num_targets).

  • model_names (list[str]) – List of model names corresponding to the predictions.

  • ensemble_strategy (EnsembleStrategy) – Strategy for combining predictions, one of EnsembleStrategy.MEAN, EnsembleStrategy.MEDIAN, or EnsembleStrategy.WEIGHTED.

  • ensemble_weights (dict[str, float] | None) – Dictionary mapping model names to their weights for the weighted ensemble strategy. Required if ensemble_strategy is EnsembleStrategy.WEIGHTED. Defaults to None.

Return type:

ndarray

Returns:

A 3D NumPy array of ensemble predictions with shape (num_samples, horizon, num_targets).

Raises:

ValueError – If predictions is empty, prediction shapes are inconsistent, weights are required but not provided, the number of weights does not match the number of models, or the ensemble strategy is unknown.
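The three strategies reduce a stack of per-model arrays along a new leading model axis. A numpy sketch (not the twiga implementation):

```python
import numpy as np

def combine(predictions, strategy="mean", weights=None):
    """Combine per-model (B, H, T) arrays (sketch, not twiga's code)."""
    stacked = np.stack(predictions)            # (n_models, B, H, T)
    if strategy == "mean":
        return stacked.mean(axis=0)
    if strategy == "median":
        return np.median(stacked, axis=0)
    if strategy == "weighted":
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()                        # normalise to sum to 1
        return np.tensordot(w, stacked, axes=1)
    raise ValueError(f"unknown strategy: {strategy}")

p1 = np.full((2, 3, 1), 1.0)
p2 = np.full((2, 3, 1), 3.0)
out = combine([p1, p2], "weighted", weights=[1, 3])   # (1*1 + 3*3) / 4 = 2.5
```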


Exceptions#

exception twiga.core.exceptions.TwigaError

Bases: Exception

Base class for all twiga library exceptions.

exception twiga.core.exceptions.ConfigurationError

Bases: TwigaError, ValueError

Raised when a configuration is invalid or incompatible.

exception twiga.core.exceptions.MissingExtraError

Bases: TwigaError, ImportError

Raised when an optional dependency is not installed.

exception twiga.core.exceptions.NotFittedError

Bases: TwigaError, RuntimeError

Raised when a model or pipeline is used before fitting.

exception twiga.core.exceptions.PipelineError

Bases: TwigaError, RuntimeError

Raised for errors in the data pipeline.

twiga.core.exceptions.require_extra(package, extra)

Raise a helpful ImportError if an optional dependency is missing.

Parameters:
  • package (str) – The Python package to check (e.g. "shap").

  • extra (str) – The twiga extras group that provides it (e.g. "explain").

Raises:

MissingExtraError – If package cannot be imported.

Return type:

None

Example

>>> require_extra("shap", "explain")
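A sketch of the underlying check using importlib (the real function raises MissingExtraError, a TwigaError subclass; this sketch raises plain ImportError):

```python
import importlib.util

def require_extra(package: str, extra: str) -> None:
    """Raise a helpful error when an optional dependency is missing (sketch)."""
    if importlib.util.find_spec(package) is None:
        raise ImportError(
            f"{package!r} is required for this feature; "
            f"install it with: pip install 'twiga[{extra}]'"
        )

require_extra("json", "core")   # stdlib module is importable: no error
```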

Logging#

twiga.core.utils.configure(level='INFO', *, colour=True, log_file=None, file_level='DEBUG', capture_warnings=True)

Activate Twiga logging. Call once from user code or experiment scripts.

Sets up a console handler (optionally colour-coded) and an optional file handler. Safe to call multiple times - existing handlers are cleared before new ones are attached.

Parameters:
  • level (str | int) – Console log level. Accepts level names ("DEBUG", "INFO", …) or integer constants (logging.DEBUG, …). Defaults to "INFO".

  • colour (bool) – Enable ANSI colour in console output. Automatically disabled when stdout is not a TTY (e.g. CI or redirected output). Defaults to True.

  • log_file (str | Path | None) – Optional path for a plain-text log file. Parent directory is created automatically if it does not exist. Defaults to None.

  • file_level (str | int) – Log level for the file handler. Defaults to "DEBUG" so full detail is always captured on disk even when the console shows only "INFO".

  • capture_warnings (bool) – Route warnings.warn() calls through the logging system. Defaults to True.

Return type:

Logger

Returns:

The configured root Twiga logging.Logger.

Raises:

ValueError – If level or file_level is not a recognised log-level string.

Example:

configure(level="DEBUG", log_file="results/run.log")
twiga.core.utils.get_logger(name)

Return a named child of the Twiga root logger.

Call once at module level in every Twiga submodule:

log = get_logger(__name__)
Parameters:

name (str) – Dotted module name, typically __name__. Automatically prefixed with "twiga." if not already present.

Return type:

Logger

Returns:

A logging.Logger that inherits handlers from the Twiga root logger.