Time Series Differencing#

What you’ll build

A stationarity-aware preprocessing pipeline for net-load data: detect non-stationarity with ADF and KPSS tests, apply first-order and seasonal differencing with TimeSeriesDifferentiator, train a LightGBM model on the transformed target, and invert the transformation to recover physical-unit forecasts.

Prerequisites

01 - Getting Started (DataPipelineConfig, ForecasterConfig, TwigaForecaster.fit)
02 - Forecastability Analysis (ADF/KPSS stationarity tests)
03 - Feature Engineering (lag features and the feature matrix)
Python: basic statistics, numpy familiarity

Learning objectives

By the end of this notebook you will be able to:

Diagnose non-stationarity in a time series using ADF and KPSS tests and interpret their output
Apply first-order and seasonal differencing using TimeSeriesDifferentiator from Twiga
Explain the difference between first-order differencing and seasonal differencing and when to use each
Train a LightGBM model on a differenced target and correctly invert the transformation at inference time
Identify common mistakes (leaking the inversion step, double-differencing) and how to avoid them

1. Setup#

import warnings

warnings.filterwarnings("ignore")

from great_tables import GT
from lets_plot import LetsPlot
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from statsmodels.tsa.stattools import adfuller

LetsPlot.setup_html()

from twiga.core.plot import (
    plot_acf,
    plot_forecast_grid,
    plot_timeseries,
)
from twiga.core.utils import configure, get_logger

configure()
log = get_logger("tutorials")

data = pd.read_parquet("../data/MLVS-PT.parquet", columns=["timestamp", "NetLoad(kW)"])
data["timestamp"] = pd.to_datetime(data["timestamp"])
data = data[(data["timestamp"] >= "2019-01-01") & (data["timestamp"] < "2020-07-01")].copy()
data = data.reset_index(drop=True)

log.info("Shape : %s", data.shape)
log.info("Period: %s → %s", data["timestamp"].min(), data["timestamp"].max())
GT(data.head())

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[2], line 1
----> 1 data = pd.read_parquet("../data/MLVS-PT.parquet", columns=["timestamp", "NetLoad(kW)"])
data["timestamp"] = pd.to_datetime(data["timestamp"])
data = data[(data["timestamp"] >= "2019-01-01") & (data["timestamp"] < "2020-07-01")].copy()
data = data.reset_index(drop=True)

File ~/work/twiga-forecast/twiga-forecast/.venv/lib/python3.12/site-packages/pandas/io/parquet.py:669, in read_parquet(path, engine, columns, storage_options, use_nullable_dtypes, dtype_backend, filesystem, filters, **kwargs)
   use_nullable_dtypes = False
check_dtype_backend(dtype_backend)
--> 669 return impl.read(
   path,
   columns=columns,
   filters=filters,
   storage_options=storage_options,
   use_nullable_dtypes=use_nullable_dtypes,
   dtype_backend=dtype_backend,
   filesystem=filesystem,
   **kwargs,
)

File ~/work/twiga-forecast/twiga-forecast/.venv/lib/python3.12/site-packages/pandas/io/parquet.py:258, in PyArrowImpl.read(self, path, columns, filters, use_nullable_dtypes, dtype_backend, storage_options, filesystem, **kwargs)
if manager == "array":
   to_pandas_kwargs["split_blocks"] = True
--> 258 path_or_handle, handles, filesystem = _get_path_or_handle(
   path,
   filesystem,
   storage_options=storage_options,
   mode="rb",
)
try:
   pa_table = self.api.parquet.read_table(
       path_or_handle,
       columns=columns,
   (...)    270         **kwargs,
   )

File ~/work/twiga-forecast/twiga-forecast/.venv/lib/python3.12/site-packages/pandas/io/parquet.py:141, in _get_path_or_handle(path, fs, storage_options, mode, is_dir)
handles = None
if (
   not fs
   and not is_dir
   (...)    139     # fsspec resources can also point to directories
   # this branch is used for example when reading from non-fsspec URLs
--> 141     handles = get_handle(
       path_or_handle, mode, is_text=False, storage_options=storage_options
   )
   fs = None
   path_or_handle = handles.handle

File ~/work/twiga-forecast/twiga-forecast/.venv/lib/python3.12/site-packages/pandas/io/common.py:882, in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
       handle = open(
           handle,
           ioargs.mode,
   (...)    878             newline="",
       )
   else:
       # Binary mode
--> 882         handle = open(handle, ioargs.mode)
   handles.append(handle)
# Convert BytesIO or file objects passed with an encoding

FileNotFoundError: [Errno 2] No such file or directory: '../data/MLVS-PT.parquet'

2. Stationarity: Diagnosis#

A stationary time series has constant mean and variance over time.
Many regression-based forecasters (including tree models) implicitly assume the target distribution is roughly constant across the training window.
Strong trends or slowly drifting means violate this and can degrade generalisation.

Augmented Dickey-Fuller test#

The ADF test examines the null hypothesis that a unit root is present (i.e., the series is non-stationary).
A p-value > 0.05 means we cannot reject the null - the series is likely non-stationary.

Key concept - stationarity

A stationary time series has a constant mean, constant variance, and autocorrelation structure that does not change over time. Many regression-based forecasters (tree models, linear models) implicitly assume the target distribution is roughly stable across the training window. When the mean drifts or variance grows, the model sees a different distribution at test time than it saw during training - and generalisation degrades.

Two common tests:

ADF (Augmented Dickey-Fuller) - null hypothesis: the series has a unit root (non-stationary). A p-value > 0.05 means we cannot reject non-stationarity.

KPSS - null hypothesis: the series is stationary. Use both together for a more robust diagnosis.

Rolling statistics are a fast visual check: if the rolling mean or standard deviation drifts significantly, the series is likely non-stationary.

series = data["NetLoad(kW)"].dropna().values

adf_result = adfuller(series, autolag="AIC")
log.info("=== ADF Test — raw NetLoad(kW) ===")
log.info("  ADF statistic : %.4f", adf_result[0])
log.info("  p-value       : %.6f", adf_result[1])
log.info("  Critical vals : %s", {k: f"{v:.4f}" for k, v in adf_result[4].items()})
conclusion = "STATIONARY (p ≤ 0.05)" if adf_result[1] <= 0.05 else "NON-STATIONARY (p > 0.05)"
log.info("  Conclusion    : %s", conclusion)

# Rolling mean / std — visual check
s = data.set_index("timestamp")["NetLoad(kW)"]
roll = s.rolling(window=48 * 7)  # 7-day window

roll_df = pd.DataFrame(
    {
        "timestamp": s.index,
        "NetLoad(kW)": s.values,
        "Rolling Mean": roll.mean().values,
        "Rolling Std": roll.std().values,
    }
).dropna()

p1 = plot_timeseries(
    roll_df,
    y_cols=["NetLoad(kW)", "Rolling Mean"],
    date_col="timestamp",
    title="NetLoad — raw series with 7-day rolling mean",
    y_label="kW",
)
p1

# ACF of raw series — slow decay signals non-stationarity
p = plot_acf(
    pd.Series(series, name="NetLoad(kW)"),
    max_lag=96,
    title="ACF — raw NetLoad(kW) (slow decay = non-stationary)",
)
p

3. `TimeSeriesDifferentiator` API#

TimeSeriesDifferentiator is a scikit-learn TransformerMixin that applies N-th order differencing and stores the boundary values needed for perfect reconstruction. Because it inherits from sklearn.base.TransformerMixin, you also get fit_transform for free.

from great_tables import GT, md
import pandas as pd

from twiga.core.plot.gt import twiga_gt

diff_api_df = pd.DataFrame(
    {
        "Method": [
            "`fit(X)`",
            "`transform(X)`",
            "`inverse_transform(X)`",
            "`forecast_inverse(predictions)`",
            "`fit_transform(X)`",
        ],
        "Purpose": [
            "Store initial/last values of each order for later inversion",
            "Apply N-th order `np.diff`; pads leading NaNs to preserve length",
            "Reconstruct the original scale from diff + stored anchors",
            "Project future diffs into original scale using last training values",
            "Convenience: `fit` + `transform` in one call (inherited from sklearn)",
        ],
    }
)

twiga_gt(
    GT(diff_api_df)
    .tab_header(
        title=md("**TimeSeriesDifferentiator — API Reference**"),
        subtitle="Scikit-learn compatible transformer for N-th order differencing",
    )
    .cols_label(**{c: md(f"**{c}**") for c in diff_api_df.columns})
    .tab_source_note("twiga/core/data/diff.py"),
    n_rows=len(diff_api_df),
)

Key concept - first-order differencing

First-order differencing transforms a series by subtracting each value from its predecessor: Δy_t = y_t − y_{t−1}. This removes linear trends - if the original series drifts upward at a roughly constant rate, the differenced series will oscillate around zero.

Seasonal differencing subtracts a lag equal to the seasonal period (e.g., lag-48 for a 30-min series with a 24-hour cycle): Δy_t = y_t − y_{t−48}. This removes repeating patterns without destroying the trend information.

The TimeSeriesDifferentiator stores boundary values (the initial and final observations at each order) so the transformation can be perfectly inverted. No information is lost - the original series can be reconstructed exactly from the differenced series plus the stored anchors.

from twiga.core.data.diff import TimeSeriesDifferentiator

diff1 = TimeSeriesDifferentiator(order=1)
diff1.fit(series)

log.info("=== Fitted differentiator ===")
log.info("  order          : %s", diff1.order)
log.info("  initial_values : %s", diff1.history_["initial_values"])
log.info("  last_values    : %s", diff1.history_["last_values"])

diffed = diff1.transform(series)
log.info("Original length : %d", len(series))
log.info("Diffed length   : %d", len(diffed))
log.info("Leading NaNs    : %d  (equal to order=1)", np.sum(np.isnan(diffed)))
log.info("Diffed stats    — mean=%.4f  std=%.4f", np.nanmean(diffed), np.nanstd(diffed))

# Verify perfect round-trip reconstruction
reconstructed = diff1.inverse_transform(diffed)
max_error = np.max(np.abs(reconstructed - series))
log.info("Max reconstruction error: %.2e  (should be ~0)", max_error)
assert max_error < 1e-8, "Reconstruction failed!"  # noqa: S101
log.info("Round-trip test passed.")

# Side-by-side: raw vs differenced
raw_df = pd.DataFrame({"timestamp": data["timestamp"], "NetLoad(kW)": series})
diff_df = pd.DataFrame({"timestamp": data["timestamp"], "Differenced": diffed})

p1 = plot_timeseries(
    raw_df,
    y_cols=["NetLoad(kW)"],
    date_col="timestamp",
    title="Raw NetLoad(kW)",
    y_label="kW",
)
p1

p2 = plot_timeseries(
    diff_df.dropna(),
    y_cols=["Differenced"],
    date_col="timestamp",
    title="1st-order differenced NetLoad(kW)",
    y_label="Δ kW",
)
p2

4. Stationarity After Differencing#

diffed_clean = diffed[~np.isnan(diffed)]

adf_diff = adfuller(diffed_clean, autolag="AIC")
log.info("=== ADF Test — 1st-order differenced ===")
log.info("  ADF statistic : %.4f", adf_diff[0])
log.info("  p-value       : %.6f", adf_diff[1])
conclusion_diff = "STATIONARY (p ≤ 0.05)" if adf_diff[1] <= 0.05 else "NON-STATIONARY (p > 0.05)"
log.info("  Conclusion    : %s", conclusion_diff)

# ACF after differencing — should decay faster
p = plot_acf(
    pd.Series(diffed_clean, name="Differenced NetLoad(kW)"),
    max_lag=96,
    title="ACF — 1st-order differenced NetLoad (faster decay = more stationary)",
)
p

Key concept - inversion

When you forecast on a differenced series, your model predicts changes (Δy), not levels (y). To get back to the original scale you must integrate the changes - that is, take a cumulative sum starting from a known anchor.

For batch regression (where each step is predicted independently), the correct anchor is always the actual previous value, not a predicted one: ŷ_t = y_{t−1}^{actual} + Δŷ_t. This keeps errors from accumulating - each step’s error is independent.

For AR horizon forecasting (where a model generates an entire horizon window in one shot, like a neural network), the deltas are chained intentionally within that window: ŷ_1 = y_last^{train} + Δŷ_1, ŷ_2 = ŷ_1 + Δŷ_2, … TimeSeriesDifferentiator.forecast_inverse(horizon_deltas) implements this correctly.

Twiga handles both patterns - you choose the right one based on your model architecture.

5. Forecasting on the Differenced Target#

The correct workflow for batch regression (a model that predicts each timestep independently, like CatBoost) is:

Raw series
  │
  ├─ fit(train)        → store anchors
  ├─ transform(train)  → Δ_train  (train the model on Δy)
  │
  └─ reconstruction:   y_t = y_{t-1}_actual + Δy_t_predicted
                            ↑ actual, not predicted - errors never accumulate

forecast_inverse is designed for sequential AR forecasting - a model that generates a single horizon window all at once (e.g. 48-step-ahead), where chaining the deltas within that window is intentional. For independent step-by-step batch regression, use actual_prev + Δy_pred.

from great_tables import GT, md
import pandas as pd

from twiga.core.plot.gt import twiga_gt

mistakes_df = pd.DataFrame(
    {
        "Mistake": [
            "`forecast_inverse(all_test_preds)`",
            "Differencing lag features",
        ],
        "Why it's wrong": [
            "Applies `cumsum` over all test steps — every per-step error stacks on all future steps",
            "Removes the level information the model needs to predict the current level",
        ],
        "Fix": [
            "Use `actual_prev + Δy_pred` instead",
            "Keep lag features in **original scale**",
        ],
    }
)

twiga_gt(
    GT(mistakes_df)
    .tab_header(
        title=md("**Two Common Mistakes That Destroy Performance**"),
        subtitle="Avoid these when building a differenced forecasting pipeline",
    )
    .cols_label(**{c: md(f"**{c}**") for c in mistakes_df.columns})
    .tab_source_note("Twiga Forecast"),
    n_rows=len(mistakes_df),
)

# Build a simple feature matrix with lag-48 (same time yesterday)
TARGET = "NetLoad(kW)"
N_LAGS = [48, 96, 336]  # 1 day, 2 days, 7 days

df = data.copy()
for lag in N_LAGS:
    df[f"lag_{lag}"] = df[TARGET].shift(lag)

df = df.dropna().reset_index(drop=True)
feature_cols = [f"lag_{lag}" for lag in N_LAGS]

# Train/test split (last 30 days as test)
n_test = 48 * 30  # 30 days of 30-min data
train_df = df.iloc[:-n_test].copy()
test_df = df.iloc[-n_test:].copy()

log.info("Train: %d rows | Test: %d rows", len(train_df), len(test_df))

from catboost import CatBoostRegressor

# ── Baseline: train on RAW target ──────────────────────────────────────────
cb_raw = CatBoostRegressor(iterations=300, depth=6, learning_rate=0.05, verbose=0, random_seed=42)
cb_raw.fit(train_df[feature_cols], train_df[TARGET])
pred_raw = cb_raw.predict(test_df[feature_cols])

log.info("Baseline (raw target) model trained.")

# ── Differenced model ──────────────────────────────────────────────────────
# Fit differentiator on TRAIN only (prevent leakage)
diff_model = TimeSeriesDifferentiator(order=1)
diff_model.fit(train_df[TARGET].values)

# Differentiate the target — prepend last train value to bridge train→test boundary
diff_target_train = diff_model.transform(train_df[TARGET].values)
diff_target_test = diff_model.transform(np.concatenate([[train_df[TARGET].values[-1]], test_df[TARGET].values]))[
    1:
]  # drop the 1 leading NaN (order=1)

# Lag features stay in ORIGINAL scale — they carry level information the model needs.
# Differencing them would strip the "where are we now" context from every prediction.
X_train_diff = train_df[feature_cols].copy()
X_test_diff = test_df[feature_cols].copy()

# Filter out NaN row produced by differencing (first train row)
train_mask = ~np.isnan(diff_target_train)
cb_diff = CatBoostRegressor(iterations=300, depth=6, learning_rate=0.05, verbose=0, random_seed=42)
cb_diff.fit(X_train_diff.values[train_mask], diff_target_train[train_mask])

pred_diff_delta = cb_diff.predict(X_test_diff.values)

# ── Correct reconstruction for batch regression ─────────────────────────────
# y_t = y_{t-1}_actual + Δy_t_predicted
# Each prediction uses the REAL previous value — errors never accumulate.
actual_prev = np.concatenate([[train_df[TARGET].values[-1]], test_df[TARGET].values[:-1]])
pred_diff = actual_prev + pred_diff_delta

log.info("Differenced model trained. Reconstruction: y_t = y_{t-1}_actual + Δy_t_pred")

6. Evaluation & Impact#

y_true = test_df[TARGET].values


def mae(y, yhat):
    return np.mean(np.abs(y - yhat))


def rmse(y, yhat):
    return np.sqrt(np.mean((y - yhat) ** 2))


def wmape(y, yhat):
    return np.sum(np.abs(y - yhat)) / np.sum(np.abs(y)) * 100


results = pd.DataFrame(
    {
        "Model": ["CatBoost (raw)", "CatBoost (differenced)"],
        "MAE": [mae(y_true, pred_raw), mae(y_true, pred_diff)],
        "RMSE": [rmse(y_true, pred_raw), rmse(y_true, pred_diff)],
        "WMAPE (%)": [wmape(y_true, pred_raw), wmape(y_true, pred_diff)],
    }
).round(3)

log.info("\n%s", results.to_string(index=False))
results

# Visualise predictions — first 7 days of the test window
n_show = 48 * 7
ts = test_df["timestamp"].values[:n_show]

raw_model_df = pd.DataFrame({"Actual": y_true[:n_show], "forecast": pred_raw[:n_show], "Model": "CatBoost (raw)"})
diff_model_df = pd.DataFrame(
    {"Actual": y_true[:n_show], "forecast": pred_diff[:n_show], "Model": "CatBoost (differenced)"}
)
combined_forecast = pd.concat([raw_model_df, diff_model_df], ignore_index=True)

p = plot_forecast_grid(
    combined_forecast,
    actual_col="Actual",
    forecast_col="forecast",
    model_col="Model",
    n_samples_per_model=n_show,
    title="CatBoost raw vs differenced target — first 7 days of test",
    y_label="kW",
)
p

# Residual comparison
res_raw = y_true - pred_raw
res_diff = y_true - pred_diff

residuals_df = pd.DataFrame(
    {
        "value": np.concatenate([res_raw, res_diff]),
        "distribution": ["Raw residuals"] * len(res_raw) + ["Differenced residuals"] * len(res_diff),
    }
)

from twiga.core.plot import plot_density

p = plot_density(
    residuals_df,
    x_col="value",
    color_col="distribution",
    title="Residual distribution — Raw vs Differenced model",
    x_label="Residual (kW)",
)
p

7. Second-Order Differencing#

Some series require two rounds of differencing to reach stationarity (e.g., series with both a trend and seasonal drift).
TimeSeriesDifferentiator(order=2) applies np.diff twice and stores two sets of anchor values.

diff2 = TimeSeriesDifferentiator(order=2)
diff2.fit(series)

diffed2 = diff2.transform(series)
log.info("2nd-order diff — leading NaNs: %d", np.sum(np.isnan(diffed2)))

# Verify round-trip for order=2
reconstructed2 = diff2.inverse_transform(diffed2)
max_err2 = np.max(np.abs(reconstructed2 - series))
log.info("Max reconstruction error (order=2): %.2e", max_err2)
assert max_err2 < 1e-6, "2nd-order round-trip failed!"  # noqa: S101
log.info("2nd-order round-trip test passed.")

# ADF on 2nd-order differenced series
diffed2_clean = diffed2[~np.isnan(diffed2)]
adf2 = adfuller(diffed2_clean, autolag="AIC")
log.info("ADF p-value (raw):         %.6f", adf_result[1])
log.info("ADF p-value (1st order):   %.6f", adf_diff[1])
log.info("ADF p-value (2nd order):   %.6f", adf2[1])

# Visualise all three
step_idx = np.arange(len(series))
multi_diff_df = pd.DataFrame(
    {
        "step": np.concatenate([step_idx, step_idx, step_idx]),
        "value": np.concatenate([series, diffed, diffed2]),
        "Series": (
            ["Raw NetLoad"] * len(series) + ["1st-order diff"] * len(diffed) + ["2nd-order diff"] * len(diffed2)
        ),
    }
)

from twiga.core.plot import plot_density

for series_name in ["Raw NetLoad", "1st-order diff", "2nd-order diff"]:
    sub = multi_diff_df[multi_diff_df["Series"] == series_name].dropna()
    p = plot_timeseries(
        sub,
        y_cols=["value"],
        date_col="step",
        title=series_name,
        y_label="value",
    )
    display(p)

8. `forecast_inverse`: When to Use It#

forecast_inverse is correct when a model produces an entire forecast horizon in one shot (e.g. a neural network that outputs 48 future deltas at once). It chains all deltas together starting from the last observed training value.

This is intentional chaining, not an error - within a single forward pass the deltas are sequentially dependent.

last_train_value
   └─→ + Δ_pred[0] = ŷ_1
          └─→ + Δ_pred[1] = ŷ_2
                 └─→ + Δ_pred[2] = ŷ_3  ...

Use actual_prev + Δy_pred for batch regression; use forecast_inverse for single-horizon AR rollouts.

# Simulate a short future forecast scenario
HORIZON = 48  # one day ahead

# Toy: suppose our model predicted a constant delta of +0.5 kW per step
toy_future_diffs = np.full(HORIZON, 0.5)

# Invert using the 1st-order differentiator fitted on train
toy_forecast = diff_model.forecast_inverse(toy_future_diffs)

last_train_val = train_df[TARGET].values[-1]
log.info("Last training value     : %.3f kW", last_train_val)
log.info("Forecasted values[0:5]  : %s", np.round(toy_forecast[:5], 3))
log.info("Expected values[0:5]    : %s", np.round(last_train_val + np.cumsum(toy_future_diffs)[:5], 3))

forecast_df_plot = pd.DataFrame({"step": np.arange(HORIZON), "value": toy_forecast})
p = plot_timeseries(
    forecast_df_plot,
    y_cols=["value"],
    date_col="step",
    title="forecast_inverse — integrating future deltas back to original scale",
    y_label="kW",
    x_label="Forecast step",
)
p

9. When to Use Differencing#

Key caveats:

Always fit the differentiator on training data only to prevent leakage.
Do not difference lag features: they provide the level context needed to predict levels.
Keep the fitted differentiator at inference time for reconstruction.
inverse_transform reconstructs the full history (needs leading NaNs); forecast_inverse is for future horizons only.

from great_tables import GT, md
import pandas as pd

from twiga.core.plot.gt import twiga_gt

when_to_diff_df = pd.DataFrame(
    {
        "Situation": [
            "ADF p-value > 0.05, rolling mean drifts",
            "1st-order diff still non-stationary",
            "Strong seasonal pattern (e.g., 48-step)",
            "Series is already stationary",
            "Neural network models (MLPF, MLPGAM)",
            "Tree-based models (CatBoost, LightGBM)",
        ],
        "Recommended action": [
            "Apply 1st-order differencing",
            "Try 2nd-order differencing",
            "Use seasonal differencing: `np.diff(x, n=1)` with period lag",
            "Skip — differencing adds variance without benefit",
            "Usually unnecessary — networks learn trends internally",
            "Often helps when target distribution shifts over time",
        ],
    }
)

reconstruction_df = pd.DataFrame(
    {
        "Model type": [
            "Batch regression (CatBoost, LightGBM, XGBoost) — predicts each step independently",
            "Horizon AR (neural net / ARIMA forecasting a full window at once)",
        ],
        "Correct inversion": [
            "`y_t = y_{t-1}_actual + Δy_t_predicted`",
            "`diff.forecast_inverse(horizon_deltas)`",
        ],
    }
)

twiga_gt(
    GT(when_to_diff_df)
    .tab_header(title=md("**When to Apply Differencing**"), subtitle="Situation-by-situation decision guide")
    .cols_label(**{c: md(f"**{c}**") for c in when_to_diff_df.columns})
    .tab_source_note("Twiga Forecast"),
    n_rows=len(when_to_diff_df),
)

from great_tables import GT, md
import pandas as pd

from twiga.core.plot.gt import twiga_gt

reconstruction_df = pd.DataFrame(
    {
        "Model type": [
            "Batch regression (CatBoost, LightGBM, XGBoost) — predicts each step independently",
            "Horizon AR (neural net / ARIMA — generates a full horizon window at once)",
        ],
        "Correct inversion": [
            "`y_t = y_{t-1}_actual + Δy_t_predicted`",
            "`diff.forecast_inverse(horizon_deltas)`",
        ],
    }
)

twiga_gt(
    GT(reconstruction_df)
    .tab_header(
        title=md("**Reconstruction Rules by Model Type**"),
        subtitle="Choose the right inversion strategy based on how your model predicts",
    )
    .cols_label(**{c: md(f"**{c}**") for c in reconstruction_df.columns})
    .tab_source_note("Twiga Forecast"),
    n_rows=len(reconstruction_df),
)

10. Workflow Summary#

from great_tables import GT, md
import pandas as pd

from twiga.core.plot.gt import twiga_gt

workflow_df = pd.DataFrame(
    {
        "Step": [
            "1. Diagnose stationarity",
            "2. Fit differentiator",
            "3. Differentiate target",
            "4. Train model on Δ",
            "5. Invert (batch regression)",
            "6. Invert (AR horizon forecast)",
            "7. Reconstruct full history",
        ],
        "Code": [
            "`adfuller(series)`, rolling mean/std, ACF",
            "`diff = TimeSeriesDifferentiator(order=1); diff.fit(train_series)`",
            "`y_diff = diff.transform(train_series)`",
            "`model.fit(X_train_original_scale, y_diff[~np.isnan(y_diff)])`",
            "`actual_prev + model.predict(X_test)`",
            "`diff.forecast_inverse(horizon_deltas)`",
            "`diff.inverse_transform(y_diff)`",
        ],
    }
)

twiga_gt(
    GT(workflow_df)
    .tab_header(
        title=md("**Differenced Forecasting Pipeline — Step by Step**"),
        subtitle="End-to-end workflow from raw series to inverted predictions",
    )
    .cols_label(**{c: md(f"**{c}**") for c in workflow_df.columns})
    .tab_source_note("Twiga Forecast"),
    n_rows=len(workflow_df),
)

Wrapping up#

What you did

Diagnosed non-stationarity in a net-load series using the ADF test, rolling statistics, and ACF
Applied first-order differencing with TimeSeriesDifferentiator and verified perfect round-trip reconstruction
Confirmed that the differenced series is stationary using the ADF test
Built a correct diff → train → predict → invert pipeline using batch-regression reconstruction
Demonstrated second-order differencing for series that require two rounds of transformation
Used forecast_inverse for the AR horizon use case and explained when to choose each inversion strategy
Compared MAE/RMSE/WMAPE of raw vs. differenced CatBoost models on the same test window

Key takeaways

Stationarity is a practical concern, not just theory - a drifting target distribution degrades tree model generalisation at test time.
TimeSeriesDifferentiator is lossless: the original series can always be recovered exactly from the differenced series + stored anchors.
For batch regression, inversion is ŷ_t = y_{t−1}^{actual} + Δŷ_t - never accumulate predicted errors.
Always fit the differentiator on training data only; use original-scale lag features to preserve level context.
forecast_inverse is the right tool for AR horizon models that generate a full window at once.

What’s next?#

Combine differencing with the full Twiga pipeline in NB04 - ML Point Forecasting by pre-processing your DataFrame with TimeSeriesDifferentiator before passing it to TwigaForecaster, or explore NB14 - Typed Forecast Results for structured output handling.

# ruff: noqa: E501, E701, E702
from IPython.display import HTML

_TEAL = "#107591"
_TEAL_MID = "#069fac"
_TEAL_LIGHT = "#e8f5f8"
_TEAL_BEST = "#d0ecf1"
_TEXT_DARK = "#2d3748"
_TEXT_MUTED = "#718096"
_WHITE = "#ffffff"

steps = [
    {
        "num": "03",
        "title": "Feature Engineering",
        "desc": "Lag, rolling-window, and calendar features",
        "tags": ["features", "lags", "windows"],
        "active": False,
    },
    {
        "num": "04",
        "title": "Time Series Differencing",
        "desc": "Stationarity · first-order and seasonal differencing · inversion",
        "tags": ["differencing", "stationarity", "ADF"],
        "active": True,
    },
    {
        "num": "05",
        "title": "ML Point Forecasting",
        "desc": "CatBoost · XGBoost · LightGBM · model comparison",
        "tags": ["catboost", "xgboost", "lightgbm"],
        "active": False,
    },
    {
        "num": "06",
        "title": "Backtesting & Evaluation",
        "desc": "Rolling-window backtesting · fold-level metrics",
        "tags": ["backtesting", "evaluation"],
        "active": False,
    },
    {
        "num": "08",
        "title": "Quantile Regression",
        "desc": "First probabilistic step — prediction intervals",
        "tags": ["probabilistic", "quantile", "intervals"],
        "active": False,
    },
]
track_name = "Intermediate Track"
footer = 'Next: build models on the transformed target in <span style="color:#107591;font-weight:600;">05 — ML Point Forecasting</span>.'


def _b(t, bg, fg):
    return f'<span style="display:inline-block;background:{bg};color:{fg};font-size:10px;font-weight:600;padding:2px 7px;border-radius:10px;margin:2px 2px 0 0;">{t}</span>'


ch = ""
for i, s in enumerate(steps):
    a = s["active"]
    cb = _TEAL if a else _WHITE
    cbo = _TEAL if a else "#d1ecf1"
    nb = _TEAL_MID if a else _TEAL_LIGHT
    nf = _WHITE if a else _TEAL
    tf = _WHITE if a else _TEXT_DARK
    df = "#cce8ef" if a else _TEXT_MUTED
    bb = "#0d5f75" if a else _TEAL_BEST
    bf = "#b8e4ed" if a else _TEAL
    yh = (
        f'<span style="float:right;background:{_TEAL_MID};color:{_WHITE};font-size:10px;font-weight:700;padding:2px 10px;border-radius:12px;">★ you are here</span>'
        if a
        else ""
    )
    bdg = "".join(_b(t, bb, bf) for t in s["tags"])
    ch += f'<div style="background:{cb};border:2px solid {cbo};border-radius:12px;padding:16px 20px;display:flex;align-items:flex-start;gap:16px;box-shadow:{"0 4px 14px rgba(16,117,145,.25)" if a else "0 1px 4px rgba(0,0,0,.06)"};"><div style="min-width:44px;height:44px;background:{nb};color:{nf};border-radius:50%;display:flex;align-items:center;justify-content:center;font-size:15px;font-weight:800;flex-shrink:0;">{s["num"]}</div><div style="flex:1;"><div style="font-size:15px;font-weight:700;color:{tf};margin-bottom:4px;">{s["title"]}{yh}</div><div style="font-size:12.5px;color:{df};margin-bottom:8px;line-height:1.5;">{s["desc"]}</div><div>{bdg}</div></div></div>'
    if i < len(steps) - 1:
        ch += f'<div style="display:flex;justify-content:center;height:32px;"><svg width="24" height="32" viewBox="0 0 24 32" fill="none"><line x1="12" y1="0" x2="12" y2="24" stroke="{_TEAL_MID}" stroke-width="2" stroke-dasharray="4 3"/><polygon points="6,20 18,20 12,30" fill="{_TEAL_MID}"/></svg></div>'

HTML(
    f'<div style="font-family:Inter,\'Segoe UI\',sans-serif;max-width:640px;margin:8px 0;"><div style="background:linear-gradient(135deg,{_TEAL} 0%,{_TEAL_MID} 100%);border-radius:12px 12px 0 0;padding:14px 20px;display:flex;align-items:center;gap:10px;"><svg width="22" height="22" viewBox="0 0 24 24" fill="none" stroke="{_WHITE}" stroke-width="2"><path d="M12 2L2 7l10 5 10-5-10-5z"/><path d="M2 17l10 5 10-5"/><path d="M2 12l10 5 10-5"/></svg><span style="color:{_WHITE};font-size:14px;font-weight:700;">Twiga Learning Path — {track_name}</span></div><div style="border:2px solid {_TEAL_LIGHT};border-top:none;border-radius:0 0 12px 12px;padding:20px 20px 16px;background:#f9fdfe;display:flex;flex-direction:column;">{ch}<div style="margin-top:16px;font-size:11.5px;color:{_TEXT_MUTED};text-align:center;border-top:1px solid {_TEAL_LIGHT};padding-top:12px;">{footer}</div></div></div>'
)