Conformal Prediction & Uncertainty Quantification#
Source Files
twiga/distributions/conformal/core.py
twiga/distributions/conformal/base.py
twiga/distributions/conformal/cqr.py
twiga/distributions/conformal/residual_conformal
twiga/core/config/base.py
Conformal prediction provides distribution-free prediction intervals with finite-sample coverage guarantees. Unlike Bayesian methods, it makes no assumptions about the data distribution; it only requires that the calibration data be exchangeable with the data seen at prediction time.
Because calibration needs only a set of predictions and ground truth from a held-out split, conformal prediction wraps any trained Twiga model without retraining, including plain point-forecast ML models like CatBoost or XGBoost. This makes it the recommended path for adding uncertainty estimates to an existing point forecaster.
How It Works#
graph LR
A[Train Model] --> B[Calibrate on Held-Out Data]
B --> C[Compute Non-Conformity Scores]
C --> D["Calculate Threshold q̂ at (1-α) quantile"]
D --> E[Generate Intervals on New Data]
E --> F["[prediction - q̂, prediction + q̂]"]
1. Train the model on training data
2. Calibrate using held-out data to compute non-conformity scores
3. Calculate threshold \(\hat{q}\) as the \((1-\alpha)\)-quantile of the scores
4. Generate intervals on new data using the calibrated threshold
The resulting intervals have guaranteed marginal coverage of at least \(1-\alpha\).
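The steps above can be sketched in plain NumPy. This is a minimal illustration, not Twiga's implementation: the helper name `conformal_quantile`, the synthetic data, and the quantile level \(\lceil (n+1)(1-\alpha) \rceil / n\) (the usual finite-sample correction for split conformal prediction) are assumptions, and the `method="higher"` argument requires NumPy 1.22 or newer.

```python
import numpy as np


def conformal_quantile(scores: np.ndarray, alpha: float) -> float:
    """Finite-sample-adjusted (1 - alpha) quantile of calibration scores."""
    n = scores.shape[0]
    # Usual split-conformal correction: level ceil((n + 1)(1 - alpha)) / n,
    # clipped to 1.0 when the calibration set is very small.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    # method="higher" rounds up to an observed score, keeping q_hat conservative.
    return float(np.quantile(scores, level, method="higher"))


rng = np.random.default_rng(0)
# Hypothetical held-out calibration split: point predictions and noisy targets.
calib_pred = rng.normal(size=500)
calib_true = calib_pred + rng.normal(scale=0.5, size=500)

scores = np.abs(calib_true - calib_pred)       # step 2: non-conformity scores
q_hat = conformal_quantile(scores, alpha=0.1)  # step 3: threshold

new_pred = np.array([0.0, 1.5, -2.0])
lower, upper = new_pred - q_hat, new_pred + q_hat  # step 4: intervals
```

Every interval has the same half-width `q_hat` here; the quantile-based and residual-fitting methods described later let the width vary per sample.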
Class Hierarchy#
classDiagram
class BaseConformal {
<<abstract>>
+alpha: float
+q_hat: float | ndarray
+calibrate(*calib_args, axis)
+calculate_conformal_quantile(scores, axis)
+get_scores()*
+generate_intervals()*
}
class SplitConformal {
+score_type: "res" | "sign-res"
+get_scores(predicts, targets)
+generate_intervals(mu_pred)
}
class ConformalQuantileRegressor {
+score_type: "scaled" | "unscaled"
+get_scores(lower_q, upper_q, targets)
+generate_intervals(lower_q, upper_q)
}
class ConformalResidualFitting {
+score_type: "res" | "sign-res"
+get_scores(predicts, sigma, targets)
+generate_intervals(loc, sigma)
}
class Conformal {
<<factory>>
+__new__(method, score_type, alpha)
}
BaseConformal <|-- SplitConformal
BaseConformal <|-- ConformalQuantileRegressor
BaseConformal <|-- ConformalResidualFitting
Conformal ..> SplitConformal : creates
Conformal ..> ConformalQuantileRegressor : creates
Conformal ..> ConformalResidualFitting : creates
Factory: Conformal#
The Conformal class in twiga/distributions/conformal/core.py is a factory that selects the appropriate method:
from twiga.distributions.conformal.core import Conformal
# Creates a SplitConformal instance
cp = Conformal(method="residual", score_type="res", alpha=0.1)
# Creates a ConformalQuantileRegressor instance
cqr = Conformal(method="quantile", score_type="scaled", alpha=0.1)
# Creates a ConformalResidualFitting instance
crf = Conformal(method="residual-fitting", score_type="res", alpha=0.1)
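One common way to build a factory class of this kind in Python is to override `__new__` and return an instance of a different class. The sketch below shows the pattern with stand-in classes; the names `ConformalSketch`, `SplitSketch`, `QuantileSketch`, and `ResidualFittingSketch` are hypothetical, and Twiga's actual dispatch logic may differ.

```python
class SplitSketch:
    """Stand-in for a concrete conformal predictor."""

    def __init__(self, score_type: str, alpha: float) -> None:
        self.score_type = score_type
        self.alpha = alpha


class QuantileSketch(SplitSketch):
    """Stand-in for a quantile-based predictor."""


class ResidualFittingSketch(SplitSketch):
    """Stand-in for a residual-fitting predictor."""


class ConformalSketch:
    """Hypothetical factory: returns the class registered for `method`."""

    _registry = {
        "residual": SplitSketch,
        "quantile": QuantileSketch,
        "residual-fitting": ResidualFittingSketch,
    }

    def __new__(cls, method: str, score_type: str = "res", alpha: float = 0.1):
        try:
            target = cls._registry[method]
        except KeyError:
            raise ValueError(
                f"Unknown method {method!r}; expected one of {sorted(cls._registry)}"
            ) from None
        # The returned object is not an instance of `cls`, so Python skips
        # ConformalSketch.__init__ and the caller receives the concrete predictor.
        return target(score_type=score_type, alpha=alpha)


cp = ConformalSketch(method="quantile", score_type="scaled", alpha=0.1)
```

With this pattern, `type(cp)` is the concrete class rather than the factory, which matches the "creates" relationships in the class diagram.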
Configuration#
Conformal prediction is configured via ConformalConfig:
from twiga.core.config import ConformalConfig
config = ConformalConfig(
    method="residual",  # "residual", "quantile", "residual-fitting"
    score_type="res",   # method-dependent (see table below)
    alpha=0.1,          # significance level (0 < alpha < 1)
)
| Parameter | Type | Constraints | Description |
|---|---|---|---|
| method | str | Required | Conformal prediction method |
| score_type | str | Required | Non-conformity score type |
| alpha | float | 0 < alpha < 1 | Significance level (\(1-\alpha\) = target coverage) |
Valid Score Types per Method#
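| Method | Class | Valid Score Types | Description |
|---|---|---|---|
| "residual" | SplitConformal | "res", "sign-res" | Absolute or signed residuals |
| "quantile" | ConformalQuantileRegressor | "scaled", "unscaled" | CQR with or without scaling |
| "residual-fitting" | ConformalResidualFitting | "res", "sign-res" | Scale-adapted residuals |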
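The method and score-type pairing can be checked before any model is trained. The helper below is a hypothetical sketch (the names `VALID_SCORE_TYPES` and `check_conformal_settings` are not part of Twiga) that encodes the valid combinations as a lookup and rejects mismatches early.

```python
VALID_SCORE_TYPES = {
    "residual": {"res", "sign-res"},
    "quantile": {"scaled", "unscaled"},
    "residual-fitting": {"res", "sign-res"},
}


def check_conformal_settings(method: str, score_type: str, alpha: float) -> None:
    """Raise ValueError if the combination of settings is not valid."""
    if method not in VALID_SCORE_TYPES:
        raise ValueError(f"Unknown method {method!r}")
    if score_type not in VALID_SCORE_TYPES[method]:
        allowed = sorted(VALID_SCORE_TYPES[method])
        raise ValueError(
            f"score_type {score_type!r} is not valid for {method!r}; use one of {allowed}"
        )
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")


check_conformal_settings("quantile", "scaled", 0.1)  # passes silently
```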
Methods#
Split Conformal (method="residual")#
The simplest method. Computes non-conformity scores as residuals between predictions and targets.
Score types:
- "res": \(s_i = |y_i - \hat{y}_i|\) (absolute residuals)
- "sign-res": \(s_i = y_i - \hat{y}_i\) (signed residuals)
Intervals: \([\hat{y} - \hat{q}, \; \hat{y} + \hat{q}]\)
from twiga.distributions.conformal.base import SplitConformal
cp = SplitConformal(score_type="res", alpha=0.1)
cp.calibrate(predictions, targets)
lower, upper = cp.generate_intervals(new_predictions)
Tip
Split conformal works with any point prediction model and is the easiest to set up. Use this as a starting point.
Conformal Quantile Regression (method="quantile")#
Requires a quantile regression model that produces lower and upper quantile predictions.
Score types:
- "unscaled": \(s_i = \max(q_{lo,i} - y_i, \; y_i - q_{hi,i})\)
- "scaled": \(s_i = \max\left(\frac{q_{lo,i} - y_i}{q_{hi,i} - q_{lo,i}}, \; \frac{y_i - q_{hi,i}}{q_{hi,i} - q_{lo,i}}\right)\)
Intervals (unscaled): \([q_{lo} - \hat{q}, \; q_{hi} + \hat{q}]\)
Intervals (scaled): \([q_{lo} - \hat{q} \cdot (q_{hi} - q_{lo}), \; q_{hi} + \hat{q} \cdot (q_{hi} - q_{lo})]\)
from twiga.distributions.conformal.cqr import ConformalQuantileRegressor
cqr = ConformalQuantileRegressor(score_type="scaled", alpha=0.1)
cqr.calibrate(lower_quantile, upper_quantile, targets)
lower, upper = cqr.generate_intervals(lower_quantile_new, upper_quantile_new)
Note
Scaled CQR adapts interval width based on the model’s quantile spread, producing narrower intervals where the model is more confident. Use with quantile regression models.
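The CQR score and interval formulas can be written out directly in NumPy. This is a sketch under stated assumptions, not Twiga's code: the function names, the synthetic data, and the finite-sample quantile level are illustrative, and the upper quantile is assumed to be strictly greater than the lower one so that the scaling is well defined.

```python
import numpy as np


def cqr_scores(lo, hi, y, scaled=False):
    """Non-conformity scores; positive when y falls outside [lo, hi]."""
    scores = np.maximum(lo - y, y - hi)
    # Dividing the max by the (positive) width equals the max of the two ratios.
    return scores / (hi - lo) if scaled else scores


def cqr_intervals(lo, hi, q_hat, scaled=False):
    """Widen (or shrink, if q_hat < 0) the quantile band by the margin."""
    margin = q_hat * (hi - lo) if scaled else q_hat
    return lo - margin, hi + margin


rng = np.random.default_rng(1)
n, alpha = 400, 0.1
spread = rng.uniform(1.0, 3.0, size=n)   # hypothetical heteroscedastic noise level
y = rng.normal(scale=spread)
lo, hi = -spread, spread                 # stand-in quantile predictions (too narrow)

scores = cqr_scores(lo, hi, y, scaled=True)
level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
q_hat = float(np.quantile(scores, level, method="higher"))
lower, upper = cqr_intervals(lo, hi, q_hat, scaled=True)
```

Because the margin is proportional to `hi - lo`, samples with a wider predicted band receive a larger correction than samples with a narrow one.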
Conformal Residual Fitting (method="residual-fitting")#
Requires a model that produces both a point prediction (\(\mu\)) and a scale estimate (\(\sigma\)).
Score types:
- "res": \(s_i = \frac{|\hat{y}_i - y_i|}{\sigma_i}\) (absolute)
- "sign-res": \(s_i = \frac{\hat{y}_i - y_i}{\sigma_i}\) (signed, can be negative)
Intervals: \([\mu - \sigma \cdot \hat{q}, \; \mu + \sigma \cdot \hat{q}]\)
from twiga.distributions.conformal.crc import ConformalResidualFitting
crf = ConformalResidualFitting(score_type="res", alpha=0.1)
crf.calibrate(predictions, sigma, targets)
lower, upper = crf.generate_intervals(new_predictions, new_sigma)
Note
Use residual-fitting with probabilistic models like GAUSSCATBOOSTModel or neural models with sigma outputs like MLPGAFModel.
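To make the scale adaptation concrete, the following NumPy sketch applies the "res" score and the interval formula to synthetic data. The data, variable names, and the finite-sample quantile level are assumptions made for illustration; it is not the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
n, alpha = 400, 0.1

# Hypothetical calibration outputs of a model that predicts a location and a scale.
sigma = rng.uniform(0.5, 2.0, size=n)
mu = rng.normal(size=n)
y = mu + sigma * rng.normal(size=n)

# "res" score: absolute residual normalised by the predicted scale.
scores = np.abs(mu - y) / sigma
level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
q_hat = float(np.quantile(scores, level, method="higher"))

# Interval half-width is sigma * q_hat, so it differs from sample to sample.
lower, upper = mu - sigma * q_hat, mu + sigma * q_hat
```

A single threshold `q_hat` is shared by all samples, but the resulting interval width follows the predicted `sigma`.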
Integration with TwigaForecaster#
The TwigaForecaster manages conformal prediction end-to-end:
from twiga.core.config import (
    ConformalConfig, DataPipelineConfig, ForecasterConfig,
)
from twiga.forecaster.core import TwigaForecaster
from twiga.models.ml.xgboost_model import XGBOOSTConfig
# 1. Configure with conformal prediction
conformal_config = ConformalConfig(
    method="residual",
    score_type="res",
    alpha=0.1,
)
# data_config and train_config are assumed to be built beforehand
# (presumably a DataPipelineConfig and a ForecasterConfig, respectively)
forecaster = TwigaForecaster(
    data_params=data_config,
    model_params=[XGBOOSTConfig()],
    train_params=train_config,
    conformal_params=conformal_config,
)
# 2. Train
forecaster.fit(train_df=train_df, val_df=val_df)
# 3. Calibrate on held-out data
forecaster.calibrate(calibrate_df=calibration_df)
# 4. Generate prediction intervals
interval_dict, times = forecaster.predict_interval(test_df=test_df)
for model_name, (lower, point, upper) in interval_dict.items():
    print(f"{model_name}: coverage target = {1 - conformal_config.alpha:.0%}")
# 5. Evaluate interval quality
predictions_df, metrics_df = forecaster.evaluate_interval_forecast(test_df=test_df)
# metrics_df includes: picp (coverage), winkle-score, ace, nmpi, cwe
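Two of the interval metrics can be reproduced by hand to sanity-check results. The sketch below uses the textbook definitions of prediction interval coverage probability (PICP) and the Winkler score; the function names are hypothetical, and Twiga's own metric implementations may differ in normalisation or averaging.

```python
import numpy as np


def picp(y, lower, upper):
    """Fraction of targets that fall inside their interval."""
    return float(np.mean((y >= lower) & (y <= upper)))


def winkler_score(y, lower, upper, alpha):
    """Mean interval width plus a 2/alpha penalty for each miss (lower is better)."""
    width = upper - lower
    below = np.clip(lower - y, 0, None)  # distance below the lower bound, else 0
    above = np.clip(y - upper, 0, None)  # distance above the upper bound, else 0
    return float(np.mean(width + (2 / alpha) * (below + above)))


y = np.array([1.0, 2.0, 3.0, 4.0])
lower = np.array([0.5, 2.5, 2.0, 3.0])
upper = np.array([1.5, 3.5, 4.0, 5.0])

coverage = picp(y, lower, upper)                     # 3 of 4 targets covered
score = winkler_score(y, lower, upper, alpha=0.1)
```

In this toy example only the second target misses its interval, by 0.5 below the lower bound, which dominates the Winkler score through the `2 / alpha` penalty.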
Calibration Flow#
sequenceDiagram
participant User
participant TF as TwigaForecaster
participant Conf as Conformal Factory
participant Model
User->>TF: calibrate(calibrate_df)
TF->>Model: predict(calibrate_df)
Model-->>TF: predictions
TF->>TF: get_ground_truth()
loop For each model
TF->>Conf: Conformal(method, score_type, alpha)
Conf-->>TF: conformal_instance
TF->>Conf: calibrate(predictions, targets)
Conf->>Conf: get_scores() → calculate_conformal_quantile()
Conf-->>TF: calibrated (q_hat set)
TF->>TF: store in self.conformal[model_name]
end
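The per-model loop in the diagram can be mimicked with a dictionary of thresholds. This is an illustrative sketch only: the model names, the array layout (calibration samples by forecast horizon), and the helper `conformal_quantile` are assumptions, chosen to show how `axis=0` yields one threshold per horizon step.

```python
import numpy as np


def conformal_quantile(scores, alpha, axis=0):
    """Finite-sample-adjusted quantile taken along the sample axis."""
    n = scores.shape[axis]
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, axis=axis, method="higher")


rng = np.random.default_rng(0)
targets = rng.normal(size=(300, 24))  # assumed shape: (samples, horizon)
predictions = {
    "model_a": targets + rng.normal(scale=0.3, size=targets.shape),
    "model_b": targets + rng.normal(scale=1.0, size=targets.shape),
}

q_hats = {}
for name, preds in predictions.items():
    scores = np.abs(targets - preds)  # absolute-residual scores
    q_hats[name] = conformal_quantile(scores, alpha=0.1, axis=0)  # shape (24,)
```

Each model keeps its own calibrated threshold, so a noisier model such as `model_b` ends up with wider intervals than `model_a`.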
API Reference#
- class twiga.distributions.conformal.core.Conformal(method, score_type='res', alpha=0.1)#
Bases: object
Factory class to create method-specific conformal predictors.
- class twiga.distributions.conformal.base.BaseConformal(alpha=0.1)#
Bases: ABC
Abstract base class for Conformal Prediction in Regression.
Provides core functionality for constructing prediction intervals with finite-sample coverage guarantees.
- Variables:
- Parameters:
alpha (float) – Significance level (0 < alpha < 1). Defaults to 0.1.
- Raises:
ValueError – If alpha is not in (0, 1).
- __init__(alpha=0.1)#
Initializes base conformal predictor with validation.
- calculate_conformal_quantile(scores, axis=0)#
Computes (1-alpha)-adjusted quantile of non-conformity scores.
Implements conformal quantile adjustment from Lei et al. (2017).
- Parameters:
- Return type:
- Returns:
Quantile values for interval construction
- Raises:
ValueError – For empty scores array
- calibrate(*calib_args, axis=0)#
Calibrates conformal thresholds using provided arguments.
- Parameters:
*calib_args – Implementation-specific calibration data
axis (int) – Axis for quantile computation. Defaults to 0.
- Raises:
ValueError – If calibration data validation fails
- Return type:
- abstractmethod generate_intervals(*pred_args)#
Generates prediction intervals from inputs.
Must be implemented by concrete subclasses.
- class twiga.distributions.conformal.base.SplitConformal(score_type='res', alpha=0.1)#
Bases: BaseConformal
Implements residual-based conformal prediction for regression tasks.
- Variables:
score_type (str) – Type of non-conformity score (‘res’ or ‘sign-res’).
- generate_intervals(mu_pred)#
Generates prediction intervals.
- class twiga.distributions.conformal.cqr.ConformalQuantileRegressor(score_type='scaled', alpha=0.1)#
Bases: BaseConformal
Conformal quantile regression for uncertainty estimation.
Implements conformal prediction intervals for quantile regression models using either scaled or unscaled non-conformity scores.
- Variables:
- Parameters:
- Raises:
ValueError – For invalid score_type or alpha values.
- generate_intervals(lower_quantile, upper_quantile)#
Generates calibrated prediction intervals.
- get_scores(lower_quantile, upper_quantile, targets)#
Computes non-conformity scores for quantile predictions.
- class twiga.distributions.conformal.crc.ConformalResidualFitting(score_type='res', alpha=0.1)#
Bases: BaseConformal
Conformal residual fitting for uncertainty estimation.
Implements conformal prediction intervals using residual-based scoring with optional scale adaptation.
- Variables:
- Parameters:
- Raises:
ValueError – For invalid score_type or alpha values.
- __init__(score_type='res', alpha=0.1)#
Initializes residual fitter with validation.
- generate_intervals(loc, sigma)#
Generates calibrated prediction intervals.