Add online EWA and shrinkage covariance/precision estimators by MaxHalford · Pull Request #1923 · online-ml/river

MaxHalford · 2026-06-25T11:17:49Z

What

Adds a family of online covariance estimators to `river.covariance`, reimplemented from the `precise` package following River conventions (dict-native, Narwhals mini-batches), per discussion #1884. Only genuinely online methods are included: nothing that lazily inverts a stored covariance on read.

New estimators (`river/covariance/ewa.py`)

- `EwaCovariance`: exponentially weighted (RiskMetrics-style) covariance. Diagonal matches `stats.EWVar`, off-diagonals match `stats.EWCov`. For non-stationary streams whose relationships drift over time.
- `LedoitWolfCovariance` / `OASCovariance`: data-driven shrinkage of the EWA covariance towards a scaled identity (Ledoit-Wolf 2004 / Chen et al. 2010 intensities). For high-dimensional / few-sample regimes where the raw covariance is noisy or singular.
- `ShrunkCovariance`: fixed-intensity shrinkage with a finance-friendly constant-correlation target (or identity).
- `EwaPrecision`: exponentially weighted precision (inverse covariance) maintained online via a forgetting-factor Sherman-Morrison update. Genuinely online (O(d²) per step, never inverts explicitly); the recency-weighted counterpart of `EmpiricalPrecision`.
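
For readers unfamiliar with the forgetting-factor Sherman-Morrison update mentioned for the precision estimator, here is a minimal NumPy sketch. It assumes a zero-mean stream and the recursion S_t = (1 - α) S_{t-1} + α x xᵀ; the function name `ew_precision_step` and the identity initialisation are illustrative assumptions, not the PR's actual code.

```python
import numpy as np


def ew_precision_step(P, x, alpha):
    """One rank-one update of P = inv(S) for S <- (1 - alpha) * S + alpha * x x^T.

    Hypothetical helper for illustration; assumes a zero-mean stream.
    """
    Q = P / (1.0 - alpha)  # inverse of the decayed covariance (1 - alpha) * S
    Qx = Q @ x
    # Sherman-Morrison correction for the rank-one term alpha * x x^T
    return Q - alpha * np.outer(Qx, Qx) / (1.0 + alpha * (x @ Qx))


rng = np.random.default_rng(0)
d, alpha = 3, 0.1
S = np.eye(d)  # covariance tracked explicitly, only to check the result
P = np.eye(d)  # precision tracked without ever calling an inverse

for _ in range(200):
    x = rng.normal(size=d)
    S = (1 - alpha) * S + alpha * np.outer(x, x)
    P = ew_precision_step(P, x, alpha)

# P should stay the inverse of S up to floating-point error.
max_err = np.abs(P @ S - np.eye(d)).max()
```

Each step costs O(d²) because it only needs one matrix-vector product and one outer product, which is the property the description highlights.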

Supporting additions

- `stats.EWCov`: exponentially weighted covariance primitive (bivariate counterpart of `stats.EWVar`), composed from `EWMean`s so the convention matches exactly.
- `datasets.SP500Stocks`: daily returns (in %) of ten large-cap S&P 500 stocks across diverse sectors (2013–2018, 1,257 trading days), used in the docstring examples. Bundled as `sp500.csv.gz`.
- Guards `SymmetricMatrix.__repr__` against empty (unfitted) matrices (previously raised `ValueError`).
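
To illustrate what "composed from EW means" can look like, the sketch below builds an exponentially weighted covariance as E[xy] - E[x]E[y] from three exponentially weighted means. The class names `EWMeanSketch` and `EWCovSketch` are hypothetical, and seeding each mean with the first observation is an assumption made for this example rather than a statement about River's implementation.

```python
class EWMeanSketch:
    """Exponentially weighted mean (illustrative, not River's stats.EWMean)."""

    def __init__(self, fading_factor):
        self.fading_factor = fading_factor
        self.value = None  # assumption: seeded with the first observation

    def update(self, x):
        if self.value is None:
            self.value = x
        else:
            a = self.fading_factor
            self.value = a * x + (1 - a) * self.value


class EWCovSketch:
    """EW covariance as E[xy] - E[x]E[y], built from three EW means."""

    def __init__(self, fading_factor):
        self.mean_x = EWMeanSketch(fading_factor)
        self.mean_y = EWMeanSketch(fading_factor)
        self.mean_xy = EWMeanSketch(fading_factor)

    def update(self, x, y):
        self.mean_x.update(x)
        self.mean_y.update(y)
        self.mean_xy.update(x * y)

    def get(self):
        return self.mean_xy.value - self.mean_x.value * self.mean_y.value


xs = [0.5, -1.2, 0.3, 2.0, -0.7]
ys = [1.0, -0.8, 0.1, 1.5, -1.1]

cov, cov_swapped, var = EWCovSketch(0.3), EWCovSketch(0.3), EWCovSketch(0.3)
for x, y in zip(xs, ys):
    cov.update(x, y)
    cov_swapped.update(y, x)
    var.update(x, x)  # feeding (x, x) gives the EW variance E[x^2] - E[x]^2

cov_xy, cov_yx, var_x = cov.get(), cov_swapped.get(), var.get()
```

Because all three means share the same fading factor and seeding rule, the covariance of a variable with itself reduces to the matching variance, which is the kind of consistency between diagonal and off-diagonal entries the description refers to.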

Design notes

- Internals are array-backed with a feature→index map (the same pattern as the existing `EmpiricalPrecision`) behind a dict-native public interface (`update`, `update_many`, `matrix`, `__getitem__`).
- The EWA convention reuses `stats.EWMean`/`EWVar` so the diagonal and off-diagonals are exactly the existing scalar EW statistics.
- No precision equivalents for the shrinkage estimators: shrinking toward a scaled identity is a full-rank perturbation that can't be tracked by rank-one inverse updates, so a "shrunk precision" would require inverting on read, which is exactly the lazy approach this work excludes.
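
The last point can be checked numerically. The sketch below shrinks a singular covariance toward a scaled identity, (1 - δ) S + δ (tr(S)/d) I, and inspects the rank of the resulting perturbation. The helper name `shrink_to_scaled_identity` and the fixed intensity of 0.2 are assumptions for illustration; the Ledoit-Wolf and OAS estimators choose the intensity from the data.

```python
import numpy as np


def shrink_to_scaled_identity(S, intensity):
    """Return (1 - intensity) * S + intensity * (trace(S) / d) * I.

    Hypothetical helper; the intensity is passed in rather than estimated.
    """
    d = S.shape[0]
    mu = np.trace(S) / d
    return (1.0 - intensity) * S + intensity * mu * np.eye(d)


rng = np.random.default_rng(1)
X = rng.normal(size=(3, 5))  # 3 samples, 5 features: the few-sample regime
S = X.T @ X / len(X)         # rank at most 3, so S is singular
shrunk = shrink_to_scaled_identity(S, intensity=0.2)

min_eig_raw = np.linalg.eigvalsh(S).min()        # numerically zero
min_eig_shrunk = np.linalg.eigvalsh(shrunk).min()  # strictly positive
perturbation_rank = np.linalg.matrix_rank(shrunk - S)
```

Here `shrunk - S` has full rank, so the change from S to the shrunk matrix cannot be expressed as a single rank-one term, which is why a Sherman-Morrison style update does not apply to it.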

Tests

- `river/covariance/test_ewa.py`:
  - EWA vs an independent numpy reference and `stats.EWVar`/`EWCov`
  - `update_many` ≡ single-update loop across all estimators and across pandas/polars backends
  - shrinkage vs a `sklearn.covariance` oracle
  - `EwaPrecision` vs a numpy oracle and `P @ S ≈ I`
  - PSD/symmetry invariants
  - pickling
  - empty-repr
- All doctests use the real `SP500Stocks` data. `uv run mypy` and `ruff` are clean.
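
The `update_many` ≡ single-update-loop check can be sketched on a toy estimator that also shows the array-backed, dict-native pattern from the design notes. Everything here is hypothetical: `ToyEwaCovariance` is not the PR's class, it assumes a zero-mean stream, it treats missing features as zero, and its `update_many` takes column names plus a NumPy array instead of a dataframe.

```python
import numpy as np


class ToyEwaCovariance:
    """Toy zero-mean EW covariance with a feature -> index map (illustrative)."""

    def __init__(self, fading_factor):
        self.fading_factor = fading_factor
        self._index = {}            # feature name -> row/column position
        self._S = np.zeros((0, 0))  # dense backing array

    def _positions(self, names):
        # Register unseen features and grow the backing array if needed.
        for name in names:
            if name not in self._index:
                self._index[name] = len(self._index)
        d, k = len(self._index), self._S.shape[0]
        if d > k:
            grown = np.zeros((d, d))
            grown[:k, :k] = self._S
            self._S = grown
        return [self._index[name] for name in names]

    def update(self, x):
        pos = self._positions(list(x))
        v = np.zeros(len(self._index))
        v[pos] = list(x.values())
        a = self.fading_factor
        self._S = (1 - a) * self._S + a * np.outer(v, v)

    def update_many(self, columns, values):
        pos = self._positions(list(columns))
        X = np.zeros((len(values), len(self._index)))
        X[:, pos] = values
        a, n = self.fading_factor, len(X)
        w = (1 - a) ** np.arange(n - 1, -1, -1)  # oldest row decays the most
        self._S = (1 - a) ** n * self._S + a * (X * w[:, None]).T @ X

    def __getitem__(self, key):
        i, j = key
        return self._S[self._index[i], self._index[j]]


rng = np.random.default_rng(42)
columns = ["a", "b", "c"]
values = rng.normal(size=(50, 3))

one_by_one = ToyEwaCovariance(0.1)
for row in values:
    one_by_one.update(dict(zip(columns, row)))

batched = ToyEwaCovariance(0.1)
batched.update_many(columns, values[:20])
batched.update_many(columns, values[20:])

max_gap = np.abs(one_by_one._S - batched._S).max()
```

The batched path unrolls the same recursion into one weighted matrix product, so the two estimators should agree up to floating-point error regardless of how the stream is split into batches.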

🤖 Generated with Claude Code

…stimators

Add a family of online covariance estimators reimplemented from the `precise` package, following River conventions (dict-native, Narwhals mini-batches) and excluding any lazy invert-on-read methods:

- covariance.EwaCovariance: exponentially weighted covariance (RiskMetrics style); diagonal matches stats.EWVar, off-diagonals match stats.EWCov.
- covariance.LedoitWolfCovariance / OASCovariance: data-driven shrinkage towards a scaled identity for high-dimensional / few-sample regimes.
- covariance.ShrunkCovariance: fixed-intensity shrinkage with a constant-correlation (finance) or identity target.
- covariance.EwaPrecision: exponentially weighted precision via a forgetting-factor Sherman-Morrison update; genuinely online, never inverts explicitly. Recency-weighted counterpart of EmpiricalPrecision.
- stats.EWCov: exponentially weighted covariance primitive (bivariate counterpart of stats.EWVar).
- datasets.SP500Stocks: daily returns of ten large-cap S&P 500 stocks (2013-2018), used in the docstring examples.

Internals are array-backed with a feature->index map (like EmpiricalPrecision) behind a dict-native interface. Also guards SymmetricMatrix.__repr__ against empty (unfitted) matrices.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

MaxHalford · 2026-06-25T11:19:13Z

@microprediction this PR pulls some of precise's methods into River. I'm very grateful for this gift!

Replace the circular EWCov test (which re-implemented the estimator's own E[xy]-E[x]E[y] recursion) with a comparison against pandas' ewm().cov(), and add a test comparing EmpiricalCovariance against sklearn's batch estimator.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Drop pytest.importorskip in favour of a plain inline sklearn import (matching the existing sklearn test), extract the _dense value helper to module level, and trim the EWCov comment.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…many narwhals-native

Migrate the empirical estimators' `update_many` off the hard-coded pandas path (`.values`/`.columns`) to the `utils.dataframe` narwhals boundary helpers, matching the new EWA/shrinkage estimators and the rest of the #1919 migration. Any narwhals-supported eager dataframe (pandas, polars, pyarrow, ...) now flows through; the pandas path is byte-for-byte unchanged. Adds multi-backend tests via the `frame_backend` fixture.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

MaxHalford requested review from AdilZouitine and smastelini as code owners June 25, 2026 11:17

MaxHalford and others added 3 commits June 25, 2026 20:14

MaxHalford mentioned this pull request Jun 26, 2026

Migrate all mini-batch (_many) methods to narwhals for dataframe-agnostic support #1919 (Open, 17 tasks)

MaxHalford wants to merge 4 commits into main from feat/online-covariance-estimators
