Experiment Tracking#
Source file: twiga/tracking/tracker.py - TwigaTracker
Overview#
TwigaTracker is a context manager that wraps an MLflow run lifecycle and exposes
helpers tuned to Twiga’s data structures: Pydantic configs, metrics DataFrames, and
checkpoint directories. It is installed as part of the [mlops] extra.
pip install "twiga[mlops]"
Quick start#
from twiga.tracking import TwigaTracker
with TwigaTracker(experiment="load-forecast", run_name="lgbm-v1") as tracker:
    forecaster.fit(train_df, val_df)
    _, metrics_df = forecaster.evaluate(test_df)
    tracker.log_metrics(metrics_df)
    tracker.log_forecaster(forecaster)
The block above:
- Creates (or resumes) the MLflow experiment "load-forecast".
- Starts a new run named "lgbm-v1".
- Logs evaluation metrics keyed as {model}/{metric}.
- Logs data pipeline params and checkpoint artefacts.
- Ends the run with status FINISHED on exit (or FAILED on exception).
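The run lifecycle in the list above follows the standard context-manager pattern. The sketch below illustrates only the status handling. `SketchTracker` and its `status` attribute are hypothetical stand-ins invented for this example; it does not call MLflow and is not the TwigaTracker implementation.

```python
from typing import Optional


class SketchTracker:
    """Hypothetical stand-in that mimics only the run-status behaviour."""

    def __init__(self, experiment: str, run_name: Optional[str] = None) -> None:
        self.experiment = experiment
        self.run_name = run_name
        self.status = "NOT_STARTED"

    def __enter__(self) -> "SketchTracker":
        # A real tracker would create or resume the experiment and start a run here.
        self.status = "RUNNING"
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # FINISHED on a clean exit, FAILED if the block raised.
        self.status = "FINISHED" if exc_type is None else "FAILED"
        return False  # never swallow the caller's exception


# Clean exit.
with SketchTracker("load-forecast", "lgbm-v1") as ok_run:
    pass

# Exit through an exception.
failed_run = SketchTracker("load-forecast", "lgbm-v1")
try:
    with failed_run:
        raise RuntimeError("training crashed")
except RuntimeError:
    pass
```

Returning `False` from `__exit__` lets the original exception propagate, so a failed run is recorded without hiding the error from the caller.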
API reference#
TwigaTracker#
- class twiga.tracking.TwigaTracker(experiment='twiga', run_name=None, tracking_uri=None, tags=None, system_metrics=False)#
Bases: object
Context manager for tracking Twiga experiments in MLflow.
Provides high-level utilities for logging metadata, models, and evaluation results while maintaining strict MLOps standards for lineage and reproducibility.
- __init__(experiment='twiga', run_name=None, tracking_uri=None, tags=None, system_metrics=False)#
Initializes the tracker and sets the tracking URI.
- Parameters:
  - experiment (str) – Name of the MLflow experiment.
  - run_name (str | None) – Optional name for this specific run.
  - tracking_uri (str | None) – Optional URI for the MLflow tracking server.
  - tags (dict[str, str] | None) – Initial tags to attach to the run.
  - system_metrics (bool) – Enable MLflow system-metrics monitoring (CPU/RAM/GPU). Disabled by default to avoid noisy log output for short runs.
- log_dataset(df, name, source='unknown', context='training')#
Logs dataset lineage for reproducibility.
- log_evaluation(metrics_df, results_df)#
Logs summary metrics and interactive evaluation tables.
- log_forecaster_metadata(forecaster)#
Harvests all available configurations and hyperparameters.
Consolidates Pydantic configs and top-level primitive attributes into MLflow parameters.
- Return type:
- log_model(forecaster, sample_input)#
Logs the entire forecaster as a PyFunc model with environment dependencies.
- Parameters:
  - forecaster (TwigaForecaster) – The fitted TwigaForecaster.
  - sample_input (DataFrame) – Sample data used to infer the model schema/signature.
- Return type:
Logging helpers#
Parameters from Pydantic configs#
log_config uses model_dump() and flattens nested dicts with dot-separated
keys so that all Pydantic config fields appear as searchable MLflow parameters:
tracker.log_config(forecaster.data_pipeline, prefix="data")
# → logs data.target_feature, data.forecast_horizon, data.lookback_window_size, …
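The flattening step described above can be sketched in plain Python. This is an illustration of dot-separated flattening under stated assumptions, not Twiga's actual code: the `flatten` helper is hypothetical, and the `dumped` dict (including the nested `scaler` field and all values) is made up to stand in for the output of `model_dump()`.

```python
from typing import Any


def flatten(nested: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into a single level with dot-separated keys."""
    flat: dict[str, Any] = {}
    for key, value in nested.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse, carrying the accumulated key as the new prefix.
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat


# Made-up stand-in for a Pydantic model_dump() result.
dumped = {
    "target_feature": "load",
    "forecast_horizon": 48,
    "scaler": {"kind": "standard"},  # hypothetical nested field
}
params = flatten(dumped, prefix="data")
```

Every leaf value ends up under a unique flat key, which is what makes the fields individually searchable as parameters.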
Metrics from a DataFrame#
log_metrics reads the standard Twiga evaluation DataFrame (columns include
mae, rmse, corr, Model). Each model gets its own metric namespace:
tracker.log_metrics(metrics_df)
# → logs lgbm/mae, lgbm/rmse, catboost/mae, catboost/rmse, …
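A rough sketch of the `{model}/{metric}` namespacing follows. To stay dependency-free it uses a list of row dicts in place of the DataFrame; the `metric_keys` helper and the numbers are assumptions for illustration, and the real method may treat columns differently.

```python
def metric_keys(rows: list[dict]) -> dict[str, float]:
    """Map every non-"Model" column of each row to a '{model}/{metric}' key."""
    out: dict[str, float] = {}
    for row in rows:
        model = row["Model"]
        for name, value in row.items():
            if name != "Model":
                out[f"{model}/{name}"] = float(value)
    return out


# Made-up evaluation rows, one per model.
rows = [
    {"Model": "lgbm", "mae": 1.5, "rmse": 2.0},
    {"Model": "catboost", "mae": 1.25, "rmse": 1.75},
]
logged = metric_keys(rows)
```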
Full forecaster log#
log_forecaster is a convenience wrapper that logs:
| What | MLflow path |
|---|---|
|  | params |
|  | params |
| Checkpoint directory |  |
tracker.log_forecaster(forecaster)
ExperimentEngine tracking#
ExperimentEngine automatically logs to MLflow when
a tracking URI is configured. No code changes are needed — just set the URI before
running any experiment script.
Recommended backend: SQLite#
The MLflow file-based backend (file://) works for artifact storage but the
newer MLflow UI (2.x+) requires a SQL store for full functionality (run tables,
comparisons, search). Use SQLite — it is still local, requires no server, and
is a single file:
# Set once in your shell profile or .env
export MLFLOW_TRACKING_URI=sqlite:///$(pwd)/mlruns.db
Or pass it per-run via --tracking-uri:
uv run python experiment/mlgaf_ablation.py \
--group gating --dataset MLVS-PT \
--tracking-uri sqlite:///$(pwd)/mlruns.db
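If the URI is built from Python rather than the shell, the same absolute-path form can be produced with `pathlib`. The helper name `sqlite_tracking_uri` is hypothetical; only the `sqlite:///` scheme and the `mlruns.db` file name come from the commands above.

```python
from pathlib import Path


def sqlite_tracking_uri(db_file: str = "mlruns.db") -> str:
    """Return a sqlite URI for db_file resolved against the current directory."""
    # On POSIX the absolute path starts with "/", so the result has four
    # slashes in total, matching what $(pwd) produces in the shell version.
    return f"sqlite:///{Path.cwd() / db_file}"


uri = sqlite_tracking_uri()
```

The resulting string can then be assigned to the `MLFLOW_TRACKING_URI` environment variable or passed as `--tracking-uri`.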
Starting the MLflow UI#
After running one or more experiments, start the UI in a separate terminal:
uv run mlflow ui \
--backend-store-uri sqlite:///mlruns.db \
--port 5000
Then open http://localhost:5000 in your browser. The experiment named after
spec.name (e.g. MLPGAF Regularization Ablation) appears in the left sidebar
under Experiments. Click it and go to the Runs tab to see all runs.
To keep the UI running in the background:
uv run mlflow ui --backend-store-uri sqlite:///mlruns.db --port 5000 &
Run hierarchy#
Each engine.run() call writes a three-level hierarchy:
MLflow Experiment: <spec.name>
└─ Parent Run: <YYYYMMDD_HHMMSS_<git-hash>>
   ├─ hpo/<dataset>/<model>            ← backbone HPO (one per model/dataset)
   └─ <dataset>/<group>/<condition>    ← one per ablation condition
      └─ fold_1, fold_2, …             ← per-fold NN training runs
The parent run logs CV split settings and a results/ artifact with the
cross-condition summary CSV. Each condition child run logs mean ± std metrics
across folds and the effective model config params.
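The naming scheme and the per-condition aggregation can be sketched as follows. All three helpers are hypothetical, the timestamp, git hash, and `no_gate` condition are made-up values, and the use of the sample standard deviation is an assumption (the engine may use the population form).

```python
from datetime import datetime
from statistics import mean, stdev


def parent_run_name(started: datetime, git_hash: str) -> str:
    """Build a parent run name of the form YYYYMMDD_HHMMSS_<git-hash>."""
    return f"{started:%Y%m%d_%H%M%S}_{git_hash}"


def condition_run_name(dataset: str, group: str, condition: str) -> str:
    """Build a child run name of the form <dataset>/<group>/<condition>."""
    return f"{dataset}/{group}/{condition}"


def summarise_folds(fold_values: list[float]) -> tuple[float, float]:
    """Return (mean, sample std) of one metric across folds."""
    return mean(fold_values), stdev(fold_values)


name = parent_run_name(datetime(2024, 1, 31, 9, 5, 7), "abc1234")
child = condition_run_name("MLVS-PT", "gating", "no_gate")
mae_mean, mae_std = summarise_folds([1.0, 2.0, 3.0])  # one value per fold
```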
Disabling tracking#
Omit the env var and do not pass --tracking-uri — the engine detects that no
URI is configured and runs without any MLflow calls.
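One way such detection could work is sketched below. The function names are hypothetical, and the precedence (explicit `--tracking-uri` value first, then the `MLFLOW_TRACKING_URI` environment variable) is an assumption, since the text above does not state which one wins.

```python
import os
from typing import Optional


def resolve_tracking_uri(cli_value: Optional[str] = None) -> Optional[str]:
    """Return the URI to use, or None when tracking should be skipped."""
    # Assumed precedence: CLI value, then environment variable.
    # Empty strings are treated the same as "not set".
    return cli_value or os.environ.get("MLFLOW_TRACKING_URI") or None


def tracking_enabled(cli_value: Optional[str] = None) -> bool:
    """True when some tracking URI is configured."""
    return resolve_tracking_uri(cli_value) is not None
```

A caller would guard every logging call on `tracking_enabled()` so that an unconfigured environment never touches MLflow.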
Using a remote tracking server#
Pass tracking_uri to point at a remote MLflow instance:
with TwigaTracker(
    experiment="solar-forecast",
    tracking_uri="http://mlflow.internal:5000",
) as tracker:
    ...
Without tracking_uri, MLflow defaults to a local mlruns/ directory in the
working directory.
Prefect integration#
Inside a Prefect flow, the log_to_mlflow task wraps TwigaTracker so you
do not need to call it directly:
from twiga.pipeline import training_flow
training_flow(
    forecaster=forecaster,
    data_path="data/load.parquet",
    experiment="load-forecast",
    run_name="lgbm-v1",
)
See Pipeline for the complete flow reference.