Neural Network Models#
What you’ll build
Five neural network forecasters - MLPF (an MLP-fusion architecture), MLPGAM, MLPGAF, an RNN, and N-HiTS - trained with PyTorch Lightning on the MLVS-PT net-load dataset, benchmarked against the LightGBM baseline from NB05, and compared on MAE and training speed.
Prerequisites
01 - Getting Started (DataPipelineConfig, ForecasterConfig, TwigaForecaster.fit)
03 - Feature Engineering (understanding the (B, L, F) tensor)
05 - ML Point Forecasting (ML baseline to beat)
06 - Backtesting & Evaluation (metric interpretation)
Python: basic PyTorch awareness (not required to write any)
Learning objectives
By the end of this notebook you will be able to:
Explain when neural networks outperform gradient-boosted trees and when they do not
Configure MLPFConfig and NHiTSConfig, including embedding types and sequence dimensions
Train a neural network forecaster with PyTorch Lightning using early stopping
Compare neural and ML model metrics fairly using a shared DataPipelineConfig
Interpret training curves and understand what 5-epoch results mean vs. fully converged results
Key concept - why neural networks?
Gradient boosting (LightGBM, XGBoost, CatBoost) is hard to beat on tabular data with hand-crafted features. Neural networks earn their keep when:
The dataset is large - NNs scale better with data volume; tree models plateau.
Raw sequences matter - NNs can learn from the full look-back window without manual lag selection.
You need probabilistic outputs - distribution heads (NB07 - 08) attach naturally to NN backbones.
Transfer learning is on the table - pretrained NN weights can be fine-tuned on new sites.
With only max_epochs=5 in this tutorial, LightGBM will likely win. That is intentional - it illustrates the training-budget trade-off. With ≥ 50 epochs and a proper learning-rate schedule the NN models typically match or surpass tree-based models on this dataset.
1. Setup#
import os
import warnings
from great_tables import GT, md
from IPython.display import clear_output
from lets_plot import LetsPlot
import pandas as pd
LetsPlot.setup_html()
from twiga.core.plot import (
plot_forecast,
plot_forecast_grid,
plot_metrics_bar,
)
from twiga.core.plot.gt import twiga_gt, twiga_report
from twiga.core.utils import configure, get_logger
warnings.filterwarnings("ignore")
configure()
log = get_logger("tutorials")
Load data#
The dataset covers Madeira, Portugal (32.37°N, 16.27°W) at 30-minute resolution. We only load the columns we need.
data = pd.read_parquet("../data/MLVS-PT.parquet")
data = data[["timestamp", "NetLoad(kW)", "Ghi", "Temperature"]]
data["timestamp"] = pd.to_datetime(data["timestamp"])
data = data.drop_duplicates(subset="timestamp").reset_index(drop=True)
# Restrict to 2019-2020 to keep tutorial execution fast
data = data[(data["timestamp"] >= "2019-01-01") & (data["timestamp"] <= "2020-12-31")].reset_index(drop=True)
log.info("Shape: %s", data.shape)
twiga_gt(GT(data.head().round(2)))
Train / val / test splits#
splits_df = pd.DataFrame(
{
"Split": ["Train", "Validation", "Test"],
        "Period": ["before 2020-01-01", "2020-01-01 to 2020-06-30", "2020-07-01 onwards"],
"Purpose": ["Model learning", "Early-stopping / overfitting guard", "Final honest evaluation"],
}
)
twiga_gt(
GT(splits_df)
.tab_header(title=md("**Data Splits**"), subtitle="Chronological - no shuffling, no overlap")
.cols_label(**{c: md(f"**{c}**") for c in splits_df.columns})
.tab_source_note("Twiga Forecast"),
n_rows=len(splits_df),
)
train_df = data[data["timestamp"] < "2020-01-01"].reset_index(drop=True)
val_df = data[(data["timestamp"] >= "2020-01-01") & (data["timestamp"] < "2020-07-01")].reset_index(drop=True)
test_df = data[data["timestamp"] >= "2020-07-01"].reset_index(drop=True)
log.info(
f"train : {train_df.shape[0]:,} rows ({train_df['timestamp'].min().date()} → {train_df['timestamp'].max().date()})"
)
log.info(f"val : {val_df.shape[0]:,} rows ({val_df['timestamp'].min().date()} → {val_df['timestamp'].max().date()})")
log.info(
f"test : {test_df.shape[0]:,} rows ({test_df['timestamp'].min().date()} → {test_df['timestamp'].max().date()})"
)
2. Data config#
DataPipelineConfig is identical to previous notebooks - same target, resolution, location, features, and horizon. This ensures that all models are evaluated on the exact same problem setup.
from sklearn.preprocessing import RobustScaler, StandardScaler
from twiga.core.config import DataPipelineConfig, ForecasterConfig
data_config = DataPipelineConfig(
target_feature="NetLoad(kW)",
period="30min",
latitude=32.371666,
longitude=-16.274998,
calendar_features=["hour", "day_night"],
exogenous_features=["Ghi"],
forecast_horizon=48,
stride=48,
lookback_window_size=48,
input_scaler=StandardScaler(),
target_scaler=RobustScaler(),
)
train_config = ForecasterConfig(project_name="neural-network-tutorial")
data_config
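Before moving on, it helps to see what these window parameters imply. The sketch below (a plain-Python illustration, not Twiga's internal code - the library's exact windowing formula may differ at the edges) counts how many non-overlapping day-ahead windows a sliding-window pipeline with `lookback_window_size=48`, `forecast_horizon=48`, and `stride=48` would carve out of a dataset:

```python
# Hedged sketch: how many training windows a sliding-window pipeline
# would carve out of n_rows of 30-minute data. Each window needs
# `lookback` past steps plus `horizon` future steps, and consecutive
# windows start `stride` steps apart.
def n_windows(n_rows, lookback=48, horizon=48, stride=48):
    usable = n_rows - lookback - horizon
    return 0 if usable < 0 else usable // stride + 1

# One year of 30-minute data: 365 * 48 = 17,520 rows
print(n_windows(17_520))
```

With `stride=48` the windows tile the data day by day; a smaller stride would produce overlapping windows and more (correlated) training samples.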
3. NN Config Dimensions: Auto-populated#
ML model configs (e.g. LIGHTGBMConfig) need no knowledge of input shape - the library infers it at training time.
NN configs are different. A neural network must know:
num_target_feature: number of target variables to forecast
forecast_horizon: how many steps ahead to predict
lookback_window_size: length of the input sequence
num_historical_features, num_calendar_features, num_exogenous_features, num_future_covariates
In the current API, all of these default to 0 and are auto-populated by TwigaForecaster from DataPipelineConfig.
You simply construct the config with any training hyperparameters you want to override:
from twiga.models.nn import MLPFConfig
# Dims are filled automatically - just set training knobs
mlpf_config = MLPFConfig(max_epochs=5, rich_progress_bar=False)
The legacy MLPFConfig.from_data_config(data_config) class method still works and is useful when you need a standalone config object outside a TwigaForecaster (e.g., inspection or debugging).
Key concept - sequence embedding and the (B, L, F) tensor
Every Twiga NN model receives inputs as a 3-D tensor of shape (B, L, F):
| Axis | Meaning | Example (this notebook) |
|---|---|---|
| B | Batch size - number of windows processed simultaneously | 32 windows |
| L | Lookback length - time steps in the input sequence | 48 steps = 24 h |
| F | Feature count - target + calendar + exogenous features per step | ~5 features |
Before entering the MLP encoder, each time step can optionally be projected into a richer latent space via a value embedding (e.g. LinearEmb applies a shared linear layer across all L steps), and positional information is injected via a positional embedding (e.g. LearnPosEmb adds a trained vector to each position). Section 9 of this notebook shows how to toggle these with config knobs.
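The tensor layout and the LinearEmb idea can be illustrated with plain numpy (shapes only - the real embedding is a learned layer inside the model, and the dimensions here are illustrative):

```python
import numpy as np

# Illustrative (B, L, F) batch and a LinearEmb-style projection:
# one shared linear map applied independently at every time step.
B, L, F, D = 32, 48, 4, 16          # batch, lookback, features, latent dim
x = np.random.randn(B, L, F)        # stand-in for a batch of input windows

W = np.random.randn(F, D)           # shared weights across all L positions
emb = x @ W                         # broadcasting applies W per time step

print(x.shape, emb.shape)           # (32, 48, 4) -> (32, 48, 16)
```

The key point: the projection mixes features *within* a time step but never mixes information *across* time steps - that is the job of the encoder and the positional embedding.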
4. MLPF: Plain MLP backbone#
MLPF (MLP Fusion) is the baseline neural architecture. It encodes past and future covariates with separate MLP branches and fuses them via attention, weighted sum, or addition before predicting the horizon.
Key config knob: combination_type ("attn-comb" | "weighted-comb" | "addition-comb")
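To build intuition for what those three modes do, here is a toy scalar-weight sketch of each fusion strategy. This is a conceptual illustration only - Twiga's actual layers operate on batched tensors with learned parameters, and the function name `fuse` is ours:

```python
import numpy as np

def fuse(past, future, mode="addition-comb"):
    """Toy sketch of the three fusion modes named by combination_type.
    `past` and `future` stand in for (D,) encodings from the two
    MLP branches."""
    if mode == "addition-comb":
        return past + future                     # simple elementwise sum
    if mode == "weighted-comb":
        w = 0.7                                  # a learned scalar in practice
        return w * past + (1 - w) * future
    if mode == "attn-comb":
        # attention over the two streams, queried by the past encoding
        keys = np.stack([past, future])          # (2, D)
        scores = keys @ past / np.sqrt(len(past))
        attn = np.exp(scores) / np.exp(scores).sum()
        return attn @ keys                       # attention-weighted blend
    raise ValueError(mode)
```

Addition is the cheapest and least flexible; attention lets the model decide, per forecast, how much the future covariates should dominate.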
Key concept - Lightning training loop
Twiga NN models are trained with PyTorch Lightning, which manages the boilerplate (device placement, gradient steps, logging) so you only set high-level knobs:
max_epochs - how many full passes over the training set. More epochs = more learning time, but also more risk of overfitting. In production, 50-200 epochs is typical.
Early stopping - Lightning monitors the validation loss after each epoch. If it does not improve for patience consecutive epochs, the run terminates early, saving the best checkpoint automatically.
Checkpoint - the weights at the epoch with the lowest validation loss are saved to checkpoints/<project_name>/<model>/best_*.ckpt. You can reload them at any time (Section 10).
rich_progress_bar=False - we disable the progress bar here to keep notebook output clean. Set it to True to watch loss values epoch-by-epoch during development.
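The early-stopping rule is simple enough to state in a few lines of plain Python. This sketch mirrors the logic Lightning's EarlyStopping callback applies (simplified - the real callback also supports a `min_delta` tolerance and min/max modes):

```python
# Stop when validation loss fails to improve for `patience` consecutive
# epochs, remembering which epoch had the best weights (the checkpoint).
def early_stop_epoch(val_losses, patience=3):
    best, best_epoch, bad = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, bad = loss, epoch, 0
        else:
            bad += 1
            if bad >= patience:
                return epoch, best_epoch   # stop here, keep best weights
    return len(val_losses) - 1, best_epoch

stop, best = early_stop_epoch([1.0, 0.8, 0.7, 0.72, 0.71, 0.73], patience=3)
print(stop, best)
```

In this toy loss curve, training halts at epoch 5 (three epochs without improvement) but the checkpoint that gets reloaded is from epoch 2, where validation loss bottomed out.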
from twiga import TwigaForecaster
from twiga.models.nn import MLPFConfig
mlpf_config = MLPFConfig(max_epochs=5, rich_progress_bar=False)
forecaster_mlpf = TwigaForecaster(
data_params=data_config,
model_params=[mlpf_config],
train_params=train_config,
)
forecaster_mlpf.fit(train_df=train_df, val_df=val_df)
clear_output()
pred_mlpf, metric_mlpf = forecaster_mlpf.evaluate_point_forecast(test_df=test_df)
log.info("MLPF-mean metrics across folds:")
def get_metric_table(metric_df):
res = metric_df.groupby("Model")[["mae", "corr", "nbias", "rmse", "wmape", "smape"]].mean().round(2).reset_index()
res = res.rename(
columns={"mae": "MAE", "corr": "Corr", "wmape": "WMAPE", "smape": "SMAPE", "nbias": "NBIAS", "rmse": "RMSE"}
)
metric_name = ["MAE", "Corr", "SMAPE", "RMSE"]
minimize_cols = ["MAE", "SMAPE", "RMSE"]
maximize_cols = ["Corr"]
return twiga_report(res, metric_name, minimize_cols, maximize_cols)
get_metric_table(metric_mlpf)
Reading the MLPF metrics
With max_epochs=5, MLPF is severely under-trained. A Pearson correlation near 0 means the forecast is almost uncorrelated with the actuals - the model has not yet learned the daily cycle. This is expected at 5 epochs and will improve significantly with more training. Use it as a baseline lower bound, not a performance ceiling.
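To see why an untrained model collapses to near-zero correlation, compare a forecast that tracks the daily cycle against a nearly flat one. This is a stylised numpy illustration, not Twiga output:

```python
import numpy as np

# Stylised daily cycle over one week of 30-minute steps.
t = np.arange(7 * 48)
actual = np.sin(2 * np.pi * t / 48)

# A "trained" forecast tracks the cycle with noise; an "untrained" one
# is essentially the mean plus jitter - it carries no cycle information.
rng1, rng2 = np.random.default_rng(0), np.random.default_rng(1)
trained = actual + 0.3 * rng1.standard_normal(t.size)
untrained = actual.mean() + 0.01 * rng2.standard_normal(t.size)

r_trained = np.corrcoef(actual, trained)[0, 1]
r_untrained = np.corrcoef(actual, untrained)[0, 1]
print(round(r_trained, 2), round(r_untrained, 2))
```

The flat forecast can still have a moderate MAE (it sits near the mean), which is why correlation - not MAE alone - is the clearest tell that the daily cycle has not been learned yet.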
5. MLPGAM: MLP + Group Additive Model#
MLPGAM augments the MLP backbone with a Group Additive Model (GAM) branch. The GAM branch learns per-feature additive effects (similar in spirit to classical GAMs), which are then combined with the MLP’s global representation. This often improves interpretability and generalisation on structured tabular-temporal data.
The key addition is a Lasso penalty on the final projection weights, encouraging sparse feature selection.
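The Lasso term is just an L1 norm added to the training loss. A minimal numpy sketch (the name `lam` and the function are ours; the real loss lives inside the Lightning module):

```python
import numpy as np

# Hedged sketch: training loss = MSE + lam * ||W||_1. The L1 term
# pushes small projection weights to exactly zero, performing implicit
# feature selection on the additive branch.
def loss_with_lasso(pred, target, W, lam=1e-3):
    mse = np.mean((pred - target) ** 2)
    l1 = lam * np.abs(W).sum()
    return mse + l1
```

Unlike an L2 penalty, which shrinks weights smoothly toward zero, the L1 penalty produces exact zeros - features whose weights vanish are effectively dropped from the model.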
from twiga.models.nn import MLPGAMConfig
mlpgam_config = MLPGAMConfig(max_epochs=5, rich_progress_bar=False)
forecaster_mlpgam = TwigaForecaster(
data_params=data_config,
model_params=[mlpgam_config],
train_params=train_config,
)
forecaster_mlpgam.fit(train_df=train_df, val_df=val_df)
clear_output()
pred_mlpgam, metric_mlpgam = forecaster_mlpgam.evaluate_point_forecast(test_df=test_df)
clear_output()
get_metric_table(metric_mlpgam)
6. MLPGAF: MLP + Gated Attention Fusion#
MLPGAF replaces the simple combination step with a Gated Attention Fusion (GAF) mechanism. A learned gate decides, for each position and feature group, how much weight to give to each input stream. This can capture non-linear inter-feature interactions that the plain MLP fusion misses.
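The gating idea can be sketched in a few lines of numpy. This is a conceptual single-vector illustration (names `Wg`, `bg`, and `gated_fusion` are ours, not the Twiga layer); the real mechanism gates batched, grouped streams:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# A learned gate g in (0, 1), computed from both input streams,
# blends them per dimension: output = g * a + (1 - g) * b.
def gated_fusion(a, b, Wg, bg):
    g = sigmoid(np.concatenate([a, b]) @ Wg + bg)
    return g * a + (1.0 - g) * b
```

Because the gate is a function of the inputs, the blend can differ per sample and per dimension - that is what lets it capture interactions a fixed weighted sum cannot.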
from twiga.models.nn import MLPGAFConfig
mlpgaf_config = MLPGAFConfig(max_epochs=5, rich_progress_bar=False)
forecaster_mlpgaf = TwigaForecaster(
data_params=data_config,
model_params=[mlpgaf_config],
train_params=train_config,
)
forecaster_mlpgaf.fit(train_df=train_df, val_df=val_df)
clear_output()
pred_mlpgaf, metric_mlpgaf = forecaster_mlpgaf.evaluate_point_forecast(test_df=test_df)
clear_output()
get_metric_table(metric_mlpgaf)
7. N-HiTS: Hierarchical interpolation#
N-HiTS (Neural Hierarchical Interpolation for Time Series) decomposes the forecast horizon into multiple scales using a stack of MLP blocks, each operating at a different temporal resolution. Long-range trends are captured by blocks with low sampling rates; short-range patterns by blocks with high sampling rates. The outputs are summed (residual connections) to produce the final forecast.
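The "hierarchical interpolation" part is easy to demonstrate: a block predicts only a handful of coarse knots, which are then interpolated up to the full 48-step horizon. This numpy sketch shows the interpolation step only (the real blocks also pool the input and learn basis coefficients):

```python
import numpy as np

# Forecast a coarse horizon (a few knots) and linearly interpolate
# back up to the full horizon - one building block of N-HiTS.
def upsample_forecast(coarse, horizon=48):
    xp = np.linspace(0, horizon - 1, num=len(coarse))
    return np.interp(np.arange(horizon), xp, coarse)

coarse = np.array([10.0, 30.0, 20.0, 40.0])   # 4 knots for 48 steps
fine = upsample_forecast(coarse)
print(fine.shape)
```

Coarse blocks like this one are cheap and capture slow trends; stacking them with progressively finer blocks (more knots) and summing the residuals yields the multi-scale forecast.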
from twiga.models.nn import NHITSConfig
nhits_config = NHITSConfig(max_epochs=5, rich_progress_bar=False)
forecaster_nhits = TwigaForecaster(
data_params=data_config,
model_params=[nhits_config],
train_params=train_config,
)
forecaster_nhits.fit(train_df=train_df, val_df=val_df)
clear_output()
pred_nhits, metric_nhits = forecaster_nhits.evaluate_point_forecast(test_df=test_df)
clear_output()
get_metric_table(metric_nhits)
Reading the N-HiTS metrics
N-HiTS typically converges faster than plain MLPF because its multi-scale decomposition provides an implicit curriculum: coarse blocks learn long-range trends early while fine-grained blocks refine short-term patterns. Even at 5 epochs you should see a non-trivial correlation and an MAE well below the naive mean forecast. MLPGAM and N-HiTS are the recommended starting architectures for new energy datasets.
8. RNN#
For completeness, we also train a recurrent neural network (RNN) baseline with the same data and training configuration.
from twiga.models.nn import RNNConfig
rnn_config = RNNConfig(max_epochs=5, rich_progress_bar=False)
forecaster_rnn = TwigaForecaster(
data_params=data_config,
model_params=[rnn_config],
train_params=train_config,
)
forecaster_rnn.fit(train_df=train_df, val_df=val_df)
clear_output()
pred_rnn, metric_rnn = forecaster_rnn.evaluate_point_forecast(test_df=test_df)
clear_output()
get_metric_table(metric_rnn)
9. Embedding options#
All MLP-based configs expose two embedding knobs that control how raw input features are represented before entering the network. These are config-level settings - no architecture changes required.
from great_tables import GT, md
import pandas as pd
from twiga.core.plot.gt import twiga_gt
value_emb_df = pd.DataFrame(
{
"value_embed_type": ['"LinearEmb"', '"ConvEmb"', '"PatchEmb"', "None"],
"What it does": [
"Simple linear projection per time step",
"1-D convolution — captures local temporal patterns",
"Splits the sequence into non-overlapping patches (requires patch_len and stride)",
"No value embedding; raw features passed directly",
],
}
)
pos_emb_df = pd.DataFrame(
{
"embedding_type": ['"LearnPosEmb"', '"RotaryEmb"', '"TimeEmb"', "None"],
"What it does": [
"Learnable positional encoding (trained end-to-end)",
"Rotary positional encoding (RoPE) — encodes relative positions",
"Projects calendar/time features into embedding space",
"No positional encoding",
],
}
)
print("Value embedding (value_embed_type)")
display(
twiga_gt(
GT(value_emb_df)
.tab_header(title=md("**Value Embedding**"), subtitle="How raw features are projected before the MLP encoder")
.cols_label(**{c: md(f"**{c}**") for c in value_emb_df.columns})
.tab_source_note("Twiga Forecast"),
n_rows=len(value_emb_df),
)
)
print("Positional embedding (embedding_type)")
display(
twiga_gt(
GT(pos_emb_df)
.tab_header(
title=md("**Positional Embedding**"), subtitle="How position information is injected into the sequence"
)
.cols_label(**{c: md(f"**{c}**") for c in pos_emb_df.columns})
.tab_source_note("Twiga Forecast"),
n_rows=len(pos_emb_df),
)
)
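The PatchEmb row above deserves a quick illustration. Under the assumed semantics (non-overlapping patches of `patch_len` steps, each flattened and embedded as one token - the exact Twiga implementation may differ), patching looks like this in numpy:

```python
import numpy as np

# PatchEmb-style patching: split an L-step sequence into
# non-overlapping patches, flattening each patch's steps and features
# into a single vector that the embedding layer then projects.
L, F, patch_len = 48, 4, 8
x = np.random.randn(L, F)

n_patches = L // patch_len
patches = x[: n_patches * patch_len].reshape(n_patches, patch_len * F)
print(patches.shape)   # 6 patches of 8 steps x 4 features each
```

Patching shortens the effective sequence (48 steps become 6 tokens here), which reduces compute and lets each token summarise a local temporal neighbourhood.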
from twiga.models.nn import MLPGAMConfig
mlpgam_config_emb = MLPGAMConfig(
max_epochs=5,
rich_progress_bar=False,
value_embed_type="LinearEmb",
embedding_type="LearnPosEmb",
)
log.info("value_embed_type : %s", mlpgam_config_emb.value_embed_type)
log.info("embedding_type : %s", mlpgam_config_emb.embedding_type)
train_config.project_name = "MLPGAM_Linear_emb"
forecaster_mlpgam_emb = TwigaForecaster(
data_params=data_config,
model_params=[mlpgam_config_emb],
train_params=train_config,
)
forecaster_mlpgam_emb.fit(train_df=train_df, val_df=val_df)
clear_output()
pred_mlpgam_emb, metric_mlpgam_emb = forecaster_mlpgam_emb.evaluate_point_forecast(test_df=test_df)
clear_output()
get_metric_table(metric_mlpgam_emb)
10. Checkpoint loading#
After training, Twiga saves a Lightning checkpoint automatically. You can reload it at any time - this is useful when you want to avoid retraining or when resuming a session.
forecaster_mlpf.model.load_checkpoint()
log.info("Model in eval mode: %s", not forecaster_mlpf.model.model.training)
11. Results comparison: all NN models#
Concatenate per-fold metrics from each model and display a unified comparison table.
# Assign human-readable model labels
metric_mlpgam_emb["Model"] = "MLPGAM+Emb"
all_metrics = pd.concat(
[metric_mlpf, metric_mlpgam, metric_mlpgaf, metric_rnn, metric_nhits, metric_mlpgam_emb],
ignore_index=True,
)
get_metric_table(all_metrics)
Forecast plot: first 7 days of test set#
all_preds = pd.concat([pred_mlpf, pred_mlpgam, pred_nhits, pred_mlpgaf, pred_rnn], ignore_index=True)
p = plot_forecast_grid(
all_preds,
actual_col="Actual",
forecast_col="forecast",
model_col="Model",
n_samples_per_model=7 * 48,
y_label="Net Load (kW)",
title="NN point forecasts — first 7 days of test set (max_epochs=5)",
fig_width=920,
panel_height=400,
)
p
12. NN vs ML comparison#
With only 5 epochs, NNs may not outperform a well-tuned gradient boosting model. The cell below adds a LightGBM baseline so you can see the gap and understand how much more training the NNs would need.
With max_epochs >= 50 and a proper learning-rate schedule, the NN models typically match or surpass tree-based models on this dataset.
from twiga.models.ml import LIGHTGBMConfig
lgbm_config = LIGHTGBMConfig()
forecaster_lgbm = TwigaForecaster(
data_params=data_config,
model_params=[lgbm_config],
train_params=train_config,
)
forecaster_lgbm.fit(train_df=train_df, val_df=val_df)
clear_output()
pred_lgbm, metric_lgbm = forecaster_lgbm.evaluate_point_forecast(test_df=test_df)
clear_output()
get_metric_table(metric_lgbm)
Wrapping up#
What you did
Understood when and why neural networks beat gradient boosting (and vice versa)
Learned the (B, L, F) tensor format consumed by all Twiga NN models
Trained five NN architectures - MLPF, MLPGAM, MLPGAF, RNN, and N-HiTS - using the Lightning training loop
Configured value embeddings (LinearEmb, ConvEmb, PatchEmb) and positional embeddings (LearnPosEmb, RotaryEmb) via config knobs
Compared NN vs ML (LightGBM) performance on the same test data and understood the epoch-budget trade-off
Reloaded a saved Lightning checkpoint without retraining
Key takeaways
At low epoch counts, gradient boosting almost always wins - NNs need sufficient training time to beat tabular baselines.
MLPGAM and N-HiTS tend to converge faster than plain MLPF; they are the recommended starting architectures for energy data.
All NN dims (forecast_horizon, lookback_window_size, feature counts) are auto-populated by TwigaForecaster - you only need to set training hyperparameters in the config.
Lightning checkpointing is automatic: the best-validation-loss weights are saved and can be reloaded with .load_checkpoint().
Value and positional embeddings are orthogonal knobs - experiment with them independently before combining.
What’s next?#
07 - Quantile Regression - Add probabilistic outputs to your forecasts by training QR-LightGBM, QR-XGBoost, and FPQR models that produce calibrated prediction intervals instead of single point values.
# ruff: noqa: E501, E701, E702
from IPython.display import HTML
_TEAL = "#107591"
_TEAL_MID = "#069fac"
_TEAL_LIGHT = "#e8f5f8"
_TEAL_BEST = "#d0ecf1"
_TEXT_DARK = "#2d3748"
_TEXT_MUTED = "#718096"
_WHITE = "#ffffff"
steps = [
{
"num": "05",
"title": "ML Point Forecasting",
"desc": "CatBoost · XGBoost · LightGBM — ML baseline to beat",
"tags": ["catboost", "xgboost", "lightgbm"],
"active": False,
},
{
"num": "06",
"title": "Backtesting & Evaluation",
"desc": "Rolling-window backtesting · fold-level metrics",
"tags": ["backtesting", "evaluation"],
"active": False,
},
{
"num": "07",
"title": "Neural Networks",
"desc": "MLPF · N-HiTS · Lightning training · sequence embeddings",
"tags": ["neural network", "pytorch", "lightning"],
"active": True,
},
{
"num": "08",
"title": "Quantile Regression",
"desc": "First probabilistic step — prediction intervals",
"tags": ["probabilistic", "quantile", "intervals"],
"active": False,
},
{
"num": "09",
"title": "Parametric Distributions",
"desc": "Normal · Laplace · Gamma heads — NLL training",
"tags": ["parametric", "NLL", "distributions"],
"active": False,
},
]
track_name = "Neural Network Track"
footer = 'Next: add uncertainty to your NN with <span style="color:#107591;font-weight:600;">Quantile Regression</span> (08) or <span style="color:#107591;font-weight:600;">Parametric Distributions</span> (09).'
def _b(t, bg, fg):
return f'<span style="display:inline-block;background:{bg};color:{fg};font-size:10px;font-weight:600;padding:2px 7px;border-radius:10px;margin:2px 2px 0 0;">{t}</span>'
ch = ""
for i, s in enumerate(steps):
a = s["active"]
cb = _TEAL if a else _WHITE
cbo = _TEAL if a else "#d1ecf1"
nb = _TEAL_MID if a else _TEAL_LIGHT
nf = _WHITE if a else _TEAL
tf = _WHITE if a else _TEXT_DARK
df = "#cce8ef" if a else _TEXT_MUTED
bb = "#0d5f75" if a else _TEAL_BEST
bf = "#b8e4ed" if a else _TEAL
yh = (
f'<span style="float:right;background:{_TEAL_MID};color:{_WHITE};font-size:10px;font-weight:700;padding:2px 10px;border-radius:12px;">★ you are here</span>'
if a
else ""
)
bdg = "".join(_b(t, bb, bf) for t in s["tags"])
ch += f'<div style="background:{cb};border:2px solid {cbo};border-radius:12px;padding:16px 20px;display:flex;align-items:flex-start;gap:16px;box-shadow:{"0 4px 14px rgba(16,117,145,.25)" if a else "0 1px 4px rgba(0,0,0,.06)"};"><div style="min-width:44px;height:44px;background:{nb};color:{nf};border-radius:50%;display:flex;align-items:center;justify-content:center;font-size:15px;font-weight:800;flex-shrink:0;">{s["num"]}</div><div style="flex:1;"><div style="font-size:15px;font-weight:700;color:{tf};margin-bottom:4px;">{s["title"]}{yh}</div><div style="font-size:12.5px;color:{df};margin-bottom:8px;line-height:1.5;">{s["desc"]}</div><div>{bdg}</div></div></div>'
if i < len(steps) - 1:
ch += f'<div style="display:flex;justify-content:center;height:32px;"><svg width="24" height="32" viewBox="0 0 24 32" fill="none"><line x1="12" y1="0" x2="12" y2="24" stroke="{_TEAL_MID}" stroke-width="2" stroke-dasharray="4 3"/><polygon points="6,20 18,20 12,30" fill="{_TEAL_MID}"/></svg></div>'
HTML(
f'<div style="font-family:Inter,\'Segoe UI\',sans-serif;max-width:640px;margin:8px 0;"><div style="background:linear-gradient(135deg,{_TEAL} 0%,{_TEAL_MID} 100%);border-radius:12px 12px 0 0;padding:14px 20px;display:flex;align-items:center;gap:10px;"><svg width="22" height="22" viewBox="0 0 24 24" fill="none" stroke="{_WHITE}" stroke-width="2"><path d="M12 2L2 7l10 5 10-5-10-5z"/><path d="M2 17l10 5 10-5"/><path d="M2 12l10 5 10-5"/></svg><span style="color:{_WHITE};font-size:14px;font-weight:700;">Twiga Learning Path — {track_name}</span></div><div style="border:2px solid {_TEAL_LIGHT};border-top:none;border-radius:0 0 12px 12px;padding:20px 20px 16px;background:#f9fdfe;display:flex;flex-direction:column;">{ch}<div style="margin-top:16px;font-size:11.5px;color:{_TEXT_MUTED};text-align:center;border-top:1px solid {_TEAL_LIGHT};padding-top:12px;">{footer}</div></div></div>'
)