ADR-0006: Lazy Metrics Pattern¶
- Status:
Accepted
- Date:
2026-01-01 (v1.5.0)
Context¶
Distribution fitting computes multiple goodness-of-fit metrics:
SSE: Sum of Squared Errors (always computed, fast)
AIC/BIC: Information criteria (always computed, fast)
KS: Kolmogorov-Smirnov statistic and p-value
AD: Anderson-Darling statistic
KS and AD statistics require evaluating the CDF at every data point, making them O(n) operations that dominate fitting time for large datasets:
Fitting 90 distributions with KS/AD: ~15 seconds
Fitting 90 distributions without KS/AD: ~3 seconds
Many workflows only need AIC/BIC for model selection, making KS/AD computation wasteful.
Decision¶
We implemented lazy metric computation via the lazy_metrics config option:
config = (FitterConfigBuilder()
.with_lazy_metrics(True)
.build())
Behavior:
When
lazy_metrics=False(default): KS/AD computed during parallel fitWhen
lazy_metrics=True: KS/AD deferred until accessed
Implementation (results.py):
class LazyFitResult(FitResult):
def __init__(self, ..., data_sample: np.ndarray):
self._data_sample = data_sample
self._ks_statistic: Optional[float] = None
self._ks_pvalue: Optional[float] = None
@property
def ks_statistic(self) -> float:
if self._ks_statistic is None:
self._compute_ks()
return self._ks_statistic
def _compute_ks(self) -> None:
# Compute on-demand using stored data sample
Class hierarchy (v2.1.0 refinement):
FitResult (base)
├── EagerFitResult # KS/AD computed upfront
└── LazyFitResult # KS/AD computed on access
Data lifecycle:
Lazy results store a reference to the data sample
After KS/AD computation, sample can be released via
release_data()Serialization includes computed values, not raw data
Consequences¶
Positive:
5x speedup for AIC/BIC-only workflows
No API change: properties still accessible, just deferred
Users only pay for metrics they actually use
Parallel fitting remains embarrassingly parallel
Negative:
Memory overhead: lazy results hold data sample until metrics accessed
First access to KS/AD incurs computation delay
Serialization of lazy results may include uncomputed metrics as None
Neutral:
Default is
lazy_metrics=Falsefor backwards compatibilityLazy results explicitly document their deferred behavior
References¶
PR #72: Lazy metrics (v1.5.0)
PR #92: Eager/lazy class hierarchy (v2.1.0)
Related: ADR-0003: Configuration System (FitterConfig)