online-ml · MaxHalford · Jun 25, 2026 · Jun 25, 2026 · Jun 25, 2026 · Jun 26, 2026
@@ -15,6 +15,9 @@
 
 ## covariance
 
+- Added `EwaCovariance`, `LedoitWolfCovariance`, `OASCovariance`, and `ShrunkCovariance`: online covariance estimators for non-stationary streams (exponentially weighted, recency-biased) and high-dimensional / few-sample regimes (shrinkage towards a well-conditioned target). They are dict-native like `EmpiricalCovariance` and support mini-batches via `update_many` on any [narwhals](https://github.com/narwhals-dev/narwhals)-supported eager backend.
+- Added `EwaPrecision`, an exponentially weighted precision (inverse covariance) matrix maintained online via a forgetting-factor Sherman-Morrison update. The recency-weighted counterpart of `EmpiricalPrecision`, useful for tracking Mahalanobis distances and Gaussian likelihoods on non-stationary streams.
+- `EmpiricalCovariance.update_many` and `EmpiricalPrecision.update_many` now accept any [narwhals](https://github.com/narwhals-dev/narwhals)-supported eager dataframe (pandas, polars, pyarrow, ...) instead of pandas only. Outputs are unchanged for the pandas path.
 - Added weighted sample support to `EmpiricalCovariance.update` and `EmpiricalCovariance.revert` by accepting an optional `w` parameter and propagating it to the underlying `stats.Cov` and `stats.Var` statistics.
 - Sped up `EmpiricalCovariance.update`/`revert` (~40% faster at 30 features) by caching the sorted feature list and pair iteration in the hot path. No semantic change.
 - Restructured `EmpiricalPrecision` around NumPy-backed dense state, removing the per-update dict ↔ numpy marshalling. ~7× faster on 2000 × 20 sample streams.
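
For readers unfamiliar with the forgetting-factor Sherman-Morrison update mentioned in the `EwaPrecision` entry, here is a minimal pure-Python sketch. It assumes the covariance-like matrix is updated as `lam * C + d dᵀ`; the exact weighting used by `EwaPrecision` is not shown in this diff, and the function name `sherman_morrison_forget` and the parameter `lam` are hypothetical.

```python
def sherman_morrison_forget(P, d, lam):
    """Return the inverse of (lam * C + d d^T), given the symmetric P = C^-1.

    Hypothetical helper for illustration only, not River's implementation.
    Uses the identity
        (lam*C + d d^T)^-1 = (P - (P d)(P d)^T / (lam + d^T P d)) / lam
    which costs O(n^2) instead of the O(n^3) of a fresh inversion.
    """
    n = len(d)
    # P d (P is symmetric, so d^T P is the same vector transposed)
    Pd = [sum(P[i][k] * d[k] for k in range(n)) for i in range(n)]
    denom = lam + sum(d[i] * Pd[i] for i in range(n))
    return [
        [(P[i][j] - Pd[i] * Pd[j] / denom) / lam for j in range(n)]
        for i in range(n)
    ]


# Usage: start from a known covariance C and its inverse P.
C = [[2.0, 0.0], [0.0, 1.0]]
P = [[0.5, 0.0], [0.0, 1.0]]
d = [1.0, 1.0]   # centred observation (assumed already mean-subtracted)
lam = 0.9        # forgetting factor in (0, 1]; smaller forgets faster

P_new = sherman_morrison_forget(P, d, lam)
```

The point of the rank-one form is that the precision matrix is never re-inverted: each new sample only needs one matrix-vector product and an outer product.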
@@ -24,6 +27,7 @@
 
 - Added `datasets.CriteoAds`, a 100,000-row sample of the Criteo Display Advertising Challenge (binary click prediction with 13 integer and 26 high-cardinality categorical features). A natural fit for one-hot models such as `linear_model.AdPredictor`.
 - Added `datasets.Shuttle`, the UCI Statlog (Shuttle) dataset cast as a binary anomaly-detection task following the ODDS benchmark (49,097 observations, 9 numerical features, ~7% anomalies). Ships bundled with River.
+- Added `datasets.SP500Stocks`, daily returns (1,257 trading days, 2013-2018) for ten large-cap S&P 500 stocks across diverse sectors. A natural fit for the online covariance estimators in `river.covariance`.
 
 ## facto
 
@@ -91,6 +95,7 @@
 
 ## stats
 
+- Added `stats.EWCov`, an exponentially weighted covariance between two variables (the bivariate counterpart of `stats.EWVar`).
 - Added `stats.ChiSquared`, a streaming Chi-squared statistic between two categorical variables. Wrap it with `utils.Rolling` for a rolling version.
 
 ## stream

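The `stats.EWCov` entry above describes an exponentially weighted covariance between two variables. As a rough illustration of the idea, the sketch below uses the standard incremental recurrence for exponentially weighted moments. The class name `EWCovSketch` and the parameter `alpha` are hypothetical, and the recurrence is an assumption: the actual `stats.EWCov` implementation is not part of this diff and may differ in initialisation or bias correction.

```python
class EWCovSketch:
    """Exponentially weighted covariance of two streams (illustrative only)."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha      # weight given to the newest sample, in (0, 1)
        self.mean_x = None
        self.mean_y = None
        self.cov = 0.0

    def update(self, x, y):
        if self.mean_x is None:
            # First sample: initialise the means, covariance stays at 0.
            self.mean_x, self.mean_y = x, y
            return
        a = self.alpha
        # Deviations from the means *before* they are updated.
        dx = x - self.mean_x
        dy = y - self.mean_y
        self.mean_x += a * dx
        self.mean_y += a * dy
        # Old covariance is decayed, new cross-deviation is blended in.
        self.cov = (1 - a) * (self.cov + a * dx * dy)


# Usage: y is exactly twice x, so cov(x, y) should be twice var(x).
xs = [1.0, 3.0, 2.0, 5.0, 4.0]
cov_xy = EWCovSketch(alpha=0.3)
var_x = EWCovSketch(alpha=0.3)
for x in xs:
    cov_xy.update(x, 2 * x)
    var_x.update(x, x)   # covariance of x with itself is its variance
```

Passing the same variable twice recovers an exponentially weighted variance, which is the sense in which the bivariate statistic generalises the univariate one.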
@@ -1,7 +1,34 @@
-"""Online estimation of covariance and precision matrices."""
+"""Online estimation of covariance and precision matrices.
+
+A covariance matrix summarises how a set of variables move together. It is the engine behind
+portfolio risk, anomaly detection (via the Mahalanobis distance), Gaussian models, and many
+dimensionality-reduction methods. This module estimates it (and its inverse, the precision
+matrix) incrementally from a stream, without storing the data. See each estimator's docstring for
+what it does and when to reach for it.
+
+The estimators are dict-native: `update(x)` takes a mapping and the `matrix` is a dict of pairwise
+values. Most also expose an `update_many` method for mini-batches of any narwhals-compatible
+dataframe.
+
+"""
 
 from __future__ import annotations
 
 from .emp import EmpiricalCovariance, EmpiricalPrecision
+from .ewa import (
+    EwaCovariance,
+    EwaPrecision,
+    LedoitWolfCovariance,
+    OASCovariance,
+    ShrunkCovariance,
+)
 
-__all__ = ["EmpiricalCovariance", "EmpiricalPrecision"]
+__all__ = [
+    "EmpiricalCovariance",
+    "EmpiricalPrecision",
+    "EwaCovariance",
+    "EwaPrecision",
+    "LedoitWolfCovariance",
+    "OASCovariance",
+    "ShrunkCovariance",
+]
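
To make "shrinkage towards a well-conditioned target" concrete, here is a minimal sketch on a dict-of-pairs matrix like the one the module docstring describes. It assumes the common convex-combination form `(1 - shrinkage) * cov + shrinkage * mu * I`, with `mu` the mean of the diagonal, which is how scikit-learn defines its `ShrunkCovariance`. Whether River's `ShrunkCovariance` uses exactly this target is an assumption, and the function name `shrink` is hypothetical.

```python
def shrink(matrix, shrinkage):
    """Shrink a dict-native covariance matrix towards mu * identity.

    `matrix` maps (i, j) feature pairs to covariances and must contain the
    diagonal entries (i, i). Illustrative helper, not River's implementation.
    """
    diagonal = [v for (i, j), v in matrix.items() if i == j]
    mu = sum(diagonal) / len(diagonal)   # average variance = trace / p
    return {
        # Off-diagonal terms are scaled down; diagonal terms are pulled to mu.
        (i, j): (1 - shrinkage) * v + (shrinkage * mu if i == j else 0.0)
        for (i, j), v in matrix.items()
    }


# Usage: upper-triangular storage, as in a symmetric dict-of-pairs matrix.
cov = {("a", "a"): 4.0, ("a", "b"): 1.0, ("b", "b"): 2.0}
shrunk = shrink(cov, shrinkage=0.5)
```

Shrinking this way leaves the trace unchanged while damping the off-diagonal entries, which is what keeps the estimate well conditioned when there are few samples relative to the number of features.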
@@ -9,7 +9,7 @@
 from river import stats, utils
 
 if typing.TYPE_CHECKING:
-    import pandas as pd
+    from narwhals.stable.v2.typing import IntoDataFrame
 
 
 class SymmetricMatrix(abc.ABC):
@@ -33,6 +33,8 @@ def __getitem__(self, key):
 
     def __repr__(self):
         names = sorted({i for i, _ in self.matrix})
+        if not names:
+            return f"{type(self).__name__} (empty)"
 
         headers = [""] + list(map(str, names))
         columns = [headers[1:]]
@@ -177,25 +179,30 @@ def revert(self, x: dict, w: float = 1.0):
         for i in keys:
             cov_dict[i, i].revert(x[i], w)
 
-    def update_many(self, X: pd.DataFrame):
+    def update_many(self, X: IntoDataFrame):
         """Update with a dataframe of samples.
 
+        Any [narwhals](https://github.com/narwhals-dev/narwhals)-compatible eager dataframe
+        (pandas, polars, pyarrow, ...) is accepted.
+
         Parameters
         ----------
         X
             A dataframe of samples.
 
         """
 
-        X_arr = X.values
+        frame = utils.dataframe.into_frame(X)
+        columns = list(frame.columns)
+        X_arr = utils.dataframe.to_numpy(frame)
         mean_arr = X_arr.mean(axis=0)
         cov_arr = np.cov(X_arr.T, ddof=self.ddof)
 
-        n = len(X)
-        mean = dict(zip(X.columns, mean_arr))
+        n = len(frame)
+        mean = dict(zip(columns, mean_arr))
         cov = {
             (i, j): cov_arr[r, c]
-            for (r, i), (c, j) in itertools.combinations_with_replacement(enumerate(X.columns), r=2)
+            for (r, i), (c, j) in itertools.combinations_with_replacement(enumerate(columns), r=2)
         }
 
         self._update_from_state(n=n, mean=mean, cov=cov)
@@ -215,6 +222,7 @@ def _update_from_state(self, n: int, mean: dict, cov: float | dict):
         Raises
         ----------
             KeyError: If an element in `mean` or `cov` is missing.
+
         """
         for i, j in itertools.combinations(sorted(mean.keys()), r=2):
             try:
@@ -264,6 +272,7 @@ def _from_state(cls, n: int, mean: dict, cov: float | dict, *, ddof=1):
         Returns
         ----------
             cls: A new instance of the class with updated covariance matrix.
+
         """
         new = cls(ddof=ddof)
         new._update_from_state(n=n, mean=mean, cov=cov)
@@ -405,25 +414,29 @@ def update(self, x):
         self._w_arr[ids] = w
         self._inv_cov_mat[ix] = 0.5 * (block + block.T)
 
-    def update_many(self, X: pd.DataFrame):
+    def update_many(self, X: IntoDataFrame):
         """Update with a dataframe of samples.
 
+        Any [narwhals](https://github.com/narwhals-dev/narwhals)-compatible eager dataframe
+        (pandas, polars, pyarrow, ...) is accepted.
+
         Parameters
         ----------
         X
             A dataframe of samples.
 
         """
-        ids = self._ensure_features(X.columns)
-        X_arr = np.asarray(X.values, dtype=np.float64)
+        frame = utils.dataframe.into_frame(X)
+        ids = self._ensure_features(frame.columns)
+        X_arr = utils.dataframe.to_numpy(frame)
 
         loc = self._loc_arr[ids].copy()
         w = self._w_arr[ids].copy()
         ix = np.ix_(ids, ids)
         inv_cov = np.asfortranarray(self._inv_cov_mat[ix]) / np.maximum(w, 1)
 
         # update formulas
-        n_batch = len(X)
+        n_batch = len(frame)
         diff = X_arr - loc
         loc = (w * loc + n_batch * X_arr.mean(axis=0)) / (w + n_batch)
         w += n_batch