Getting Started with Twiga#
What you’ll build
A complete point-forecast pipeline that predicts net electrical load 24 hours ahead using a LightGBM model — from raw data (a Parquet file) to a styled metric table and forecast plot.
Prerequisites
Basic Python (variables, functions, imports)
Basic pandas (loading a CSV, filtering rows with boolean masks)
Learning objectives
By the end of this notebook you will be able to:
Load and inspect a time series dataset for Twiga
Explain the purpose of each of the three config objects (`DataPipelineConfig`, `ForecasterConfig`, model config)
Train a LightGBM forecaster with `TwigaForecaster.fit()`
Evaluate and interpret point-forecast metrics (MAE, RMSE, Correlation)
Visualise actuals vs. predictions with `plot_forecast_grid()`
The five-step workflow
flowchart LR
A["⛁ Raw Data<br>(DataFrame)"]
B["⚙ Configure<br>(3 configs)"]
C["🔧 Assemble<br>(Forecaster)"]
D["▶ Train<br>(.fit)"]
E["✓ Evaluate<br>(.evaluate)"]
A --> B --> C --> D --> E
Every tutorial follows this same pattern - later notebooks add probabilistic outputs, conformal calibration, or hyperparameter tuning on top of this core loop.
1. Setup#
# Uncomment and run if you want Plotly / Matplotlib support
# import subprocess, sys
# subprocess.check_call([sys.executable, "-m", "pip", "install", "twiga[plots]"])
import warnings
from great_tables import GT, md
from IPython.display import clear_output
from lets_plot import LetsPlot
import pandas as pd
from sklearn.preprocessing import RobustScaler, StandardScaler
LetsPlot.setup_html()
from twiga import TwigaForecaster
from twiga.core.config import DataPipelineConfig, ForecasterConfig
from twiga.core.plot import (
dual_line_plot,
plot_density,
plot_forecast,
plot_forecast_grid,
plot_timeseries,
)
from twiga.core.plot.gt import twiga_report
from twiga.core.utils import configure, get_logger
from twiga.models.ml import LIGHTGBMConfig
configure()
log = get_logger("tutorial")
Load data#
The dataset covers Madeira, Portugal (32.37°N, 16.27°W) at 30-minute resolution from 2019-01-01 to 2020-12-31. Each row is one 30-minute interval.
Column glossary

| Column | Unit | Description |
|---|---|---|
| `timestamp` | – | Date and time of the measurement |
| `NetLoad(kW)` | kilowatts | Target — electricity demand minus local renewable generation (solar + wind). This is what we forecast. |
| `Ghi` | W/m² | Global Horizontal Irradiance — solar radiation reaching a flat surface. Proxy for PV output. |
| `Temperature` | °C | Ambient air temperature. Correlates with heating/cooling load. |
Why net load? Operators schedule generation to cover net load, not gross demand. Forecasting net load directly is more useful than forecasting demand and subtracting renewable output separately.
We keep only the three columns we’ll actually use and remove any duplicate timestamps.
# Load the raw dataset and keep only the columns used in this tutorial.
data = pd.read_parquet("../data/MLVS-PT.parquet")
data = data[["timestamp", "NetLoad(kW)", "Ghi", "Temperature"]]
data["timestamp"] = pd.to_datetime(data["timestamp"])
# Remove any duplicate timestamps so the series is strictly increasing.
data = data.drop_duplicates(subset="timestamp").reset_index(drop=True)
# Restrict to 2019-2020 to keep tutorial execution fast.
# Use an exclusive upper bound: `timestamp <= "2020-12-31"` compares against
# 2020-12-31 00:00:00 and would silently drop every intraday reading on the
# final day of the dataset.
data = data[(data["timestamp"] >= "2019-01-01") & (data["timestamp"] < "2021-01-01")].reset_index(drop=True)
log.info("Shape: %s", data.shape)
# Preview the cleaned DataFrame as a Twiga-styled table.
from twiga.core.plot.gt import twiga_gt

sample_table = (
    GT(data.head())
    .tab_header(title=md("**Raw Data Sample**"), subtitle="First 5 rows of MLVS-PT")
    .cols_label(
        timestamp=md("**Timestamp**"),
        **{
            "NetLoad(kW)": md("**NetLoad (kW)**"),
            "Ghi": md("**Ghi (W/m²)**"),
            "Temperature": md("**Temperature (°C)**"),
        },
    )
    .tab_source_note("MLVS-PT dataset · Madeira, Portugal · 30-min resolution")
)
twiga_gt(sample_table, n_rows=5)
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
Cell In[3], line 1
----> 1 data = pd.read_parquet("../data/MLVS-PT.parquet")
2 data = data[["timestamp", "NetLoad(kW)", "Ghi", "Temperature"]]
3 data["timestamp"] = pd.to_datetime(data["timestamp"])
4 data = data.drop_duplicates(subset="timestamp").reset_index(drop=True)
File ~/work/twiga-forecast/twiga-forecast/.venv/lib/python3.12/site-packages/pandas/io/parquet.py:669, in read_parquet(path, engine, columns, storage_options, use_nullable_dtypes, dtype_backend, filesystem, filters, **kwargs)
666 use_nullable_dtypes = False
667 check_dtype_backend(dtype_backend)
--> 669 return impl.read(
670 path,
671 columns=columns,
672 filters=filters,
673 storage_options=storage_options,
674 use_nullable_dtypes=use_nullable_dtypes,
675 dtype_backend=dtype_backend,
676 filesystem=filesystem,
677 **kwargs,
678 )
File ~/work/twiga-forecast/twiga-forecast/.venv/lib/python3.12/site-packages/pandas/io/parquet.py:258, in PyArrowImpl.read(self, path, columns, filters, use_nullable_dtypes, dtype_backend, storage_options, filesystem, **kwargs)
256 if manager == "array":
257 to_pandas_kwargs["split_blocks"] = True
--> 258 path_or_handle, handles, filesystem = _get_path_or_handle(
259 path,
260 filesystem,
261 storage_options=storage_options,
262 mode="rb",
263 )
264 try:
265 pa_table = self.api.parquet.read_table(
266 path_or_handle,
267 columns=columns,
(...) 270 **kwargs,
271 )
File ~/work/twiga-forecast/twiga-forecast/.venv/lib/python3.12/site-packages/pandas/io/parquet.py:141, in _get_path_or_handle(path, fs, storage_options, mode, is_dir)
131 handles = None
132 if (
133 not fs
134 and not is_dir
(...) 139 # fsspec resources can also point to directories
140 # this branch is used for example when reading from non-fsspec URLs
--> 141 handles = get_handle(
142 path_or_handle, mode, is_text=False, storage_options=storage_options
143 )
144 fs = None
145 path_or_handle = handles.handle
File ~/work/twiga-forecast/twiga-forecast/.venv/lib/python3.12/site-packages/pandas/io/common.py:882, in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
873 handle = open(
874 handle,
875 ioargs.mode,
(...) 878 newline="",
879 )
880 else:
881 # Binary mode
--> 882 handle = open(handle, ioargs.mode)
883 handles.append(handle)
885 # Convert BytesIO or file objects passed with an encoding
FileNotFoundError: [Errno 2] No such file or directory: '../data/MLVS-PT.parquet'
Visualise the raw series#
Before configuring anything, always look at your data. Here we plot all three signals together to check for obvious gaps, outliers, or seasonality.
Things to notice:
NetLoad(kW): strong daily cycle (higher during the day, dips at night) and a weekly pattern (lower on weekends).
Ghi: peaks during daylight hours; near-zero at night - confirms solar irradiance.
Temperature: slower seasonal variation across months.
plot_timeseries melts any number of columns to long format internally and assigns
each series a distinct colour from the Twiga palette. n_samples=2000 sub-samples
the series so the plot renders quickly.
# Dual-axis comparison of the target against its strongest exogenous driver:
# NetLoad(kW) on one axis, Ghi on the other.
p = dual_line_plot(
    df=data,
    target_col="NetLoad(kW)",  # the forecasting target
    exog_col="Ghi",            # exogenous signal plotted on the second axis
    target_unit="kW",
    exog_unit="W/m²",
    dataset_name="MLVS-PT",
    n_samples=2000,  # sub-sample so the plot renders quickly
)
p
# Plot the raw target series; plot_timeseries melts the given columns to long
# format internally and colours each series from the Twiga palette.
p = plot_timeseries(
    data,
    y_cols=["NetLoad(kW)"],  # any number of columns can be passed here
    date_col="timestamp",
    title="MLVS-PT — Raw signals (2019 – 2020)",
    y_label="Value",
    x_label="Date",
    n_samples=2000,  # sub-sample so the plot renders quickly
    fig_size=(820, 280),  # (width, height) — presumably pixels; confirm in plot docs
)
p
Train / val / test splits#
Key concept - temporal splits
For time series you must never shuffle rows before splitting. A model trained on data from 2021 and tested on 2020 would have seen the future during training - producing unrealistically good scores that collapse on real deployment. Always split in chronological order: train on the past, test on the future.
We use three non-overlapping windows:
Train: the model learns patterns from this data.
Validation: used for early-stopping (prevents overfitting); not seen during final evaluation.
Test: held out completely until evaluation; gives the honest performance estimate.
The code below creates the three DataFrames and prints their shapes so you can confirm there is no overlap.
# Chronological, non-overlapping splits — never shuffle a time series.
# Train: 2019 · Validation: 2020-H1 · Test: 2020-H2.
train_df = data[data["timestamp"] < "2020-01-01"].reset_index(drop=True)
val_df = data[(data["timestamp"] >= "2020-01-01") & (data["timestamp"] < "2020-07-01")].reset_index(drop=True)
test_df = data[data["timestamp"] >= "2020-07-01"].reset_index(drop=True)

# Show split summary as a styled table
split_summary = pd.DataFrame(
    {
        "Split": ["Train", "Validation", "Test"],
        "Start": [
            str(train_df["timestamp"].min().date()),
            str(val_df["timestamp"].min().date()),
            str(test_df["timestamp"].min().date()),
        ],
        "End": [
            str(train_df["timestamp"].max().date()),
            str(val_df["timestamp"].max().date()),
            str(test_df["timestamp"].max().date()),
        ],
        "Rows": [f"{len(train_df):,}", f"{len(val_df):,}", f"{len(test_df):,}"],
        # Train covers 2019 only after the 2019–2020 restriction, i.e. ~12
        # months (the earlier "~23 months" label was incorrect).
        "Duration": ["~12 months", "~6 months", "~6 months"],
        "Purpose": ["Model learning", "Early-stopping / overfitting guard", "Final honest evaluation"],
    }
)
twiga_gt(
    GT(split_summary)
    .tab_header(
        title=md("**Dataset Splits**"),
        subtitle="Chronological — no shuffling, no overlap",
    )
    .cols_label(
        Split=md("**Split**"),
        Start=md("**Start**"),
        End=md("**End**"),
        Rows=md("**Rows**"),
        Duration=md("**Duration**"),
        Purpose=md("**Purpose**"),
    )
    .tab_source_note("MLVS-PT dataset · Madeira, Portugal · 30-min resolution"),
    n_rows=len(split_summary),
)
2. Configure the data pipeline#
DataPipelineConfig describes what you want to forecast and how to build the
input features. Think of it as a blueprint - no data is touched until you call fit().
It needs three groups of settings:
① Problem definition - which column is the target, how often is data recorded, and where is the sensor located (latitude/longitude are used to compute solar angles).
② Feature engineering - what extra signals to create automatically:
`calendar_features` — time-of-day ("hour"), day/night flag ("day_night"), etc.
`exogenous_features` — columns already in your DataFrame that the model can use as inputs.
③ Sequence lengths - how many past steps to feed the model (lookback_window_size)
and how many future steps to predict (forecast_horizon).
Concrete time interpretation at 30-min resolution
| Parameter | Value | Real time |
|---|---|---|
| `lookback_window_size` | 96 | 48 hours of history |
| `forecast_horizon` | 48 | 24 hours ahead |
Finally, input_scaler normalises the features (zero mean, unit variance) and
target_scaler normalises the target - both are inverted automatically at predict time.
# Blueprint for the data pipeline — nothing is computed until fit() is called.
data_config = DataPipelineConfig(
    # 1. Problem definition
    target_feature="NetLoad(kW)",  # column to forecast
    period="30min",                # sampling resolution of the raw data
    latitude=32.371666,            # Madeira, Portugal — used to compute solar angles
    longitude=-16.274998,
    # 2. Feature engineering
    calendar_features=["hour", "day_night"],  # derived automatically from the timestamp
    exogenous_features=["Ghi"],               # existing columns fed to the model
    # 3. Sequence lengths (48 steps = 24 h at 30-min resolution)
    forecast_horizon=48,       # predict 24 hours ahead
    lookback_window_size=96,   # feed 48 hours of history
    stride=48,  # NOTE(review): presumably the step between successive samples — confirm in Tutorial 06
    # Scalers (both inverted automatically at predict time)
    input_scaler=StandardScaler(),  # zero mean / unit variance for features
    target_scaler=RobustScaler(),   # median/IQR scaling for the target
)
# Inspect the resolved configuration as a plain dict.
data_config.model_dump()
3. Configure training#
The ForecasterConfig serves as the Project Manifest and Validation Orchestrator. It fulfills two critical roles:
Global Project Identity: It establishes the
project_name, which acts as a unique reference key. This is essential for:
Reproducibility: Ensuring that experiments are tracked under a consistent ID.
Storage: Directing where checkpoints_path and logs are saved.
Metadata: Mapping the date_column (e.g., ‘timestamp’) so the internal engine knows exactly which dimension represents time.
Backtesting Strategy: As we will explore in Tutorial 06, the core of this configuration defines the Backtesting (Time-Based Cross-Validation) geometry. Parameters like `window`, `split_freq`, and `test_size` determine how the model “walks forward” through history to simulate real-world deployment.
forecaster_config = ForecasterConfig(proproject_name="Getting Started Tutorial")
4. Choose a model#
What is LightGBM?
LightGBM is a gradient-boosted decision tree library developed by Microsoft. It builds an ensemble of shallow trees where each new tree corrects the errors of the previous ones. It is fast, memory-efficient, handles missing values natively, and typically outperforms deep learning on tabular / feature-engineered inputs - making it an excellent first choice for time series forecasting on small to medium datasets.
LIGHTGBMConfig holds all LightGBM hyperparameters. The defaults (n_estimators=1000,
num_leaves=64, etc.) work well for a first run - no tuning required.
Tip: once you have a working baseline here, Tutorial 10 shows how to automatically search for better hyperparameters with Optuna.
# Default LightGBM hyperparameters — good enough for a first baseline run.
model_config = LIGHTGBMConfig()
# Inspect the resolved hyperparameters as a plain dict.
model_config.model_dump()
5. Assemble the forecaster#
TwigaForecaster is the main entry point. Pass the three configs and it wires everything
together: it initialises the data pipeline, loads the model from the registry, and
prepares the cross-validation schedule.
Nothing is trained yet - assembly is cheap.
# Wire the three configs together. Assembly is cheap — nothing is trained yet.
forecaster = TwigaForecaster(
    data_params=data_config,         # what to forecast and how to build features
    model_params=[model_config],     # a list — multiple models can be registered
    train_params=forecaster_config,  # project identity + CV strategy
)
6. Train#
fit() does three things in sequence:
Feature engineering — builds calendar columns, lag features, rolling windows, etc. from `data_config`.
Scaling — normalises inputs and target using the configured scalers.
Model training — trains a LightGBM model for each horizon step; `val_df` is used for early-stopping to avoid overfitting.
Training with clear_output() suppresses the verbose LightGBM logs - remove it if you
want to watch training progress.
forecaster.fit(train_df=train_df, val_df=val_df)
7. Evaluate#
evaluate_point_forecast() runs the rolling-window CV on test_df using the schedule
from ForecasterConfig and returns two DataFrames:
`pred` — one row per timestep: `timestamp`, `Actual`, `forecast`, `Model`, `fold`. Use this for plotting or custom metric computation.
`metric` — one row per fold: MAE, RMSE, Pearson correlation, WMAPE, SMAPE, NBIAS. Use this for summarising performance.
ensemble_strategy="mean" averages predictions across folds before computing metrics
(only relevant when you have multiple models registered).
# Rolling-window evaluation on the held-out test set. Returns per-timestep
# predictions (`pred`) and per-fold metrics (`metric`).
pred, metric = forecaster.evaluate_point_forecast(test_df=test_df, ensemble_strategy="mean")
# clear_output()  # uncomment to hide the verbose evaluation logs
log.info("Evaluation complete.")
# Quick sanity check on the prediction frame's columns and values.
GT(pred.head())
Metric summary table#
twiga_report renders a Twiga-branded GT table. Best values per
column are highlighted in teal.
How to read the metrics
| Metric | Formula | Lower is better? | Rule of thumb |
|---|---|---|---|
| MAE | mean\|actual − forecast\| | ✓ | In the same units as the target (kW here) |
| RMSE | √mean(actual − forecast)² | ✓ | Penalises large errors more than MAE |
| Corr | Pearson r | ✗ | > 0.95 = excellent; > 0.90 = good |
| WMAPE | Σ\|error\| / Σ\|actual\| | ✓ | < 5% = excellent; < 10% = good |
# Average each metric across folds, per model, and render the branded report.
res = metric.groupby("Model")[["mae", "corr", "nbias", "rmse", "wmape", "smape"]].mean().round(2).reset_index()
res = res.rename(
    columns={"mae": "MAE", "corr": "Corr", "wmape": "WMAPE", "smape": "SMAPE", "nbias": "NBIAS", "rmse": "RMSE"}
)
# Include WMAPE: it is documented in the metrics table above and listed in the
# wrap-up, but was missing from the original report.
metric_name = ["MAE", "Corr", "WMAPE", "SMAPE", "RMSE"]
minimize_cols = ["MAE", "WMAPE", "SMAPE", "RMSE"]  # lower is better
maximize_cols = ["Corr"]                            # higher is better
twiga_report(res, metric_name, minimize_cols, maximize_cols)
8. Quick plot: first 7 days of test#
plot_forecast_grid creates one panel per model and overlays the forecast on the actuals.
Reading the forecast plot
The orange/red line is the model’s prediction.
The grey line is the ground truth.
Look for systematic over- or under-prediction (bias) and how well the daily peaks and troughs are captured.
A well-calibrated model tracks the actuals closely without lagging by one cycle.
We show only the first 7 days (7 × 48 = 336 steps) to keep the plot readable.
For a longer view, increase n_samples_per_model.
# `pred` was already produced by evaluate_point_forecast() above — the original
# cell also called the private `forecaster._evaluate(...)` into an unused
# variable, needlessly re-running the evaluation loop; that call is removed.
p = plot_forecast_grid(
    pred,
    actual_col="Actual",
    forecast_col="forecast",
    model_col="Model",
    n_samples_per_model=7 * 48,  # first 7 days × 48 half-hour steps = 336 points, as stated above
    y_label="Net Load (kW)",
    title="Point forecast — first 7 days of test set",
    fig_width=900,
)
p
9. API summary#
# API quick-reference: one row per object/method used in this notebook.
from great_tables import GT, md
from twiga.core.plot.gt import twiga_gt

_api_rows = [
    ("DataPipelineConfig", "Configure",
     "Declares the forecasting problem: target column, resolution, location, features, horizon & lookback"),
    ("ForecasterConfig", "Configure",
     "Sets the CV strategy: window units, training window size, test-fold size"),
    ("LIGHTGBMConfig", "Configure",
     "Holds LightGBM hyperparameters — defaults work for a first run"),
    ("TwigaForecaster(...)", "Assemble",
     "Wires the three configs together into a single forecasting object"),
    (".fit(train_df, val_df)", "Train",
     "Engineers features, scales data, and trains the model across CV folds"),
    (".evaluate_point_forecast(test_df)", "Evaluate",
     "Runs the CV evaluation loop; returns (predictions DataFrame, metrics DataFrame)"),
    ("twiga_report", "Visualise",
     "Renders a Twiga-branded GT table with best-value highlighting (requires great_tables)"),
    ("plot_forecast_grid", "Visualise",
     "Overlays forecast vs. actuals in a grid of panels, one per model"),
]
api_df = pd.DataFrame(_api_rows, columns=["Object / Method", "Step", "What it does"])

api_table = (
    GT(api_df)
    .tab_header(
        title=md("**Tutorial 01 — API Quick Reference**"),
        subtitle="Core objects and methods used in this notebook",
    )
    .cols_label(
        **{
            "Object / Method": md("**Object / Method**"),
            "Step": md("**Step**"),
            "What it does": md("**What it does**"),
        }
    )
    .tab_source_note("Full API docs → twiga-forecast.readthedocs.io")
)
twiga_gt(api_table, n_rows=len(api_df))
Wrapping up#
What you did
Loaded and explored the MLVS-PT dataset (30-min resolution, 3 columns)
Created chronological train / validation / test splits
Configured `DataPipelineConfig` with calendar + exogenous features and a 48-step horizon
Set up rolling-window cross-validation with `ForecasterConfig`
Trained a LightGBM model with `TwigaForecaster.fit()`
Evaluated and interpreted MAE, RMSE, Correlation, and WMAPE
Visualised actuals vs. forecast for the first 7 days of the test set
Key takeaways for beginners
Always split time series chronologically - never shuffle rows.
Rolling cross-validation gives more reliable metrics than a single test split.
You need three config objects - one for data, one for training strategy, one for the model.
`fit()` handles feature engineering + scaling automatically; you don’t touch the raw data again.
What’s next?#
# ruff: noqa: E501, E701, E702
# "What's next" learning-path card, rendered as inline HTML.
from IPython.display import HTML

# Twiga palette.
_TEAL = "#107591"
_TEAL_MID = "#069fac"
_TEAL_LIGHT = "#e8f5f8"
_TEAL_BEST = "#d0ecf1"
_TEXT_DARK = "#2d3748"
_TEXT_MUTED = "#718096"
_WHITE = "#ffffff"

steps = [
    {
        "num": "01",
        "title": "Getting Started",
        "desc": "Load data · configure pipeline · train LightGBM · evaluate",
        "tags": ["data", "config", "train", "evaluate"],
        "active": True,
    },
    {
        "num": "02",
        "title": "Forecastability Analysis",
        "desc": "Measure how predictable your signal is — set realistic expectations",
        "tags": ["analysis", "entropy", "ACF"],
        "active": False,
    },
    {
        "num": "03",
        "title": "Feature Engineering",
        "desc": "Lag, rolling-window, and calendar features; feature matrix inspection",
        "tags": ["features", "lags", "windows", "calendar"],
        "active": False,
    },
    {
        "num": "04",
        "title": "Time Series Differencing",
        "desc": "Stationarity · first-order and seasonal differencing · inversion",
        "tags": ["differencing", "stationarity"],
        "active": False,
    },
    {
        "num": "05",
        "title": "ML Point Forecasting",
        "desc": "CatBoost · XGBoost · LightGBM · model comparison",
        "tags": ["catboost", "xgboost", "lightgbm"],
        "active": False,
    },
]
track_name = "Beginner Track"
footer = 'After completing the beginner track, explore <span style="color:#107591;font-weight:600;">probabilistic forecasting</span> (08–10), <span style="color:#107591;font-weight:600;">hyperparameter tuning</span> (11), and <span style="color:#107591;font-weight:600;">neural networks</span> (07).'


def _badge(label, bg, fg):
    """Render one small rounded tag pill."""
    return f'<span style="display:inline-block;background:{bg};color:{fg};font-size:10px;font-weight:600;padding:2px 7px;border-radius:10px;margin:2px 2px 0 0;">{label}</span>'


# Build the card markup piece by piece, then join once at the end.
parts = []
last_index = len(steps) - 1
for idx, step in enumerate(steps):
    active = step["active"]
    # Colour scheme flips between the highlighted "current" card and the rest.
    card_bg = _TEAL if active else _WHITE
    card_border = _TEAL if active else "#d1ecf1"
    num_bg = _TEAL_MID if active else _TEAL_LIGHT
    num_fg = _WHITE if active else _TEAL
    title_fg = _WHITE if active else _TEXT_DARK
    desc_fg = "#cce8ef" if active else _TEXT_MUTED
    badge_bg = "#0d5f75" if active else _TEAL_BEST
    badge_fg = "#b8e4ed" if active else _TEAL
    shadow = "0 4px 14px rgba(16,117,145,.25)" if active else "0 1px 4px rgba(0,0,0,.06)"
    here_marker = (
        f'<span style="float:right;background:{_TEAL_MID};color:{_WHITE};font-size:10px;font-weight:700;padding:2px 10px;border-radius:12px;">★ you are here</span>'
        if active
        else ""
    )
    badges = "".join(_badge(tag, badge_bg, badge_fg) for tag in step["tags"])
    parts.append(
        f'<div style="background:{card_bg};border:2px solid {card_border};border-radius:12px;padding:16px 20px;display:flex;align-items:flex-start;gap:16px;box-shadow:{shadow};">'
        f'<div style="min-width:44px;height:44px;background:{num_bg};color:{num_fg};border-radius:50%;display:flex;align-items:center;justify-content:center;font-size:15px;font-weight:800;flex-shrink:0;">{step["num"]}</div>'
        f'<div style="flex:1;"><div style="font-size:15px;font-weight:700;color:{title_fg};margin-bottom:4px;">{step["title"]}{here_marker}</div>'
        f'<div style="font-size:12.5px;color:{desc_fg};margin-bottom:8px;line-height:1.5;">{step["desc"]}</div><div>{badges}</div></div></div>'
    )
    if idx < last_index:
        # Dashed connector arrow between consecutive cards.
        parts.append(
            f'<div style="display:flex;justify-content:center;height:32px;"><svg width="24" height="32" viewBox="0 0 24 32" fill="none"><line x1="12" y1="0" x2="12" y2="24" stroke="{_TEAL_MID}" stroke-width="2" stroke-dasharray="4 3"/><polygon points="6,20 18,20 12,30" fill="{_TEAL_MID}"/></svg></div>'
        )
ch = "".join(parts)
HTML(
    f'<div style="font-family:Inter,\'Segoe UI\',sans-serif;max-width:640px;margin:8px 0;"><div style="background:linear-gradient(135deg,{_TEAL} 0%,{_TEAL_MID} 100%);border-radius:12px 12px 0 0;padding:14px 20px;display:flex;align-items:center;gap:10px;"><svg width="22" height="22" viewBox="0 0 24 24" fill="none" stroke="{_WHITE}" stroke-width="2"><path d="M12 2L2 7l10 5 10-5-10-5z"/><path d="M2 17l10 5 10-5"/><path d="M2 12l10 5 10-5"/></svg><span style="color:{_WHITE};font-size:14px;font-weight:700;">Twiga Learning Path — {track_name}</span></div><div style="border:2px solid {_TEAL_LIGHT};border-top:none;border-radius:0 0 12px 12px;padding:20px 20px 16px;background:#f9fdfe;display:flex;flex-direction:column;">{ch}<div style="margin-top:16px;font-size:11.5px;color:{_TEXT_MUTED};text-align:center;border-top:1px solid {_TEAL_LIGHT};padding-top:12px;">{footer}</div></div></div>'
)