Fix MetricCollection state comparison for nested sequence states by omkar-334 · Pull Request #3337 · Lightning-AI/torchmetrics

omkar-334 · 2026-03-17T20:23:14Z

What does this PR do?

Before submitting

Was this discussed/agreed via a Github issue? (no need for typos and docs improvements)
Did you read the contributor guideline, Pull Request section?
Did you make sure to update the docs?
Did you write any new necessary tests?

PR review

Fixes a bug where MetricCollection crashes while auto-merging compute groups for metrics with nested sequence state, such as
MeanAveragePrecision.

Previously, _equal_metric_states assumed that list-valued state always contained tensors and accessed .shape directly on each element. This
breaks for metrics whose state contains tuples or other nested structures, leading to:

AttributeError: 'tuple' object has no attribute 'shape'

Fix: I replaced the tensor-only list comparison with a recursive state comparison function.
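The failure mode can be reproduced without torchmetrics. The following is a minimal sketch, assuming the old comparison iterated over a list-valued state and read `.shape` from every element. `old_style_shapes` is a hypothetical stand-in, not the library's code, and plain tuples play the role of the nested state entries.

```python
def old_style_shapes(state: list) -> list:
    """Hypothetical stand-in for a tensor-only step: read .shape from each element."""
    return [element.shape for element in state]


# A nested state like the one described above: a list whose elements are tuples.
nested_state = [((1.0, 2.0), (0, 1))]

try:
    old_style_shapes(nested_state)
    error_message = None
except AttributeError as err:
    # Tuples have no .shape attribute, so the tensor-only assumption breaks here.
    error_message = str(err)

print(error_message)
```

Any comparison that assumes every list element is a tensor fails the same way as soon as an element is a tuple, which is why the fix recurses into the element instead.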

📚 Documentation preview 📚: https://torchmetrics--3337.org.readthedocs.build/en/3337/

Signed-off-by: Omkar Kabde <omkarkabde@gmail.com>

omkar-334 · 2026-03-17T20:24:28Z

Before -

After -

codecov · 2026-03-18T08:57:14Z

Codecov Report

❌ Patch coverage is 9.09091% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 36%. Comparing base (d184220) to head (1f374cd).

❌ Your project check has failed because the head coverage (36%) is below the target coverage (95%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@          Coverage Diff           @@
##           master   #3337   +/-   ##
======================================
- Coverage      37%     36%   -0%     
======================================
  Files         349     349           
  Lines       19901   19907    +6     
======================================
+ Hits         7264    7265    +1     
- Misses      12637   12642    +5

Copilot

Pull request overview

Fixes a crash in MetricCollection when auto-merging compute groups for metrics whose states contain nested sequences (e.g., list of tuples), as reported in #3335.

Changes:

Introduced a recursive state-value comparison helper to correctly compare nested structures in metric states.
Updated MetricCollection._equal_metric_states to use the new recursive comparison.
Added a unit test covering nested sequence state values to prevent regressions.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `src/torchmetrics/collections.py` | Replaces tensor-only list state comparison with a recursive comparator that supports nested sequences/mappings. |
| `tests/unittests/bases/test_collections.py` | Adds a regression test ensuring compute group merging works with nested sequence state values. |

omkar-334 · 2026-04-05T12:03:08Z

@justusschock all tests are passing.... can you review this?

m-matthias · 2026-04-20T11:33:05Z

Is there any update on this? It would be great if metric collections containing e.g. MeanAveragePrecision can be used with compute_groups enabled.

Borda

Thanks for tracking down this issue and the PR is well-targeted. A few things stand out:

The root cause was clear: _equal_metric_states used isinstance(state1, list) which misses tuple states that MeanAveragePrecision stores. The fix generalizes this to all Sequence types via a recursive helper.

The new _equal_state_value function is clean -- it handles Tensor, Mapping, Sequence (with string exclusion), and falls back to direct == for primitives. No new imports needed, no API changes.
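The branch order described here can be sketched in plain Python. This is not the PR's code: `equal_state_value_sketch` is a hypothetical name, and the Tensor branch is deliberately omitted so the sketch runs without torch. The Mapping, Sequence, and fallback branches follow the diff excerpts quoted in this review.

```python
from collections.abc import Mapping, Sequence
from typing import Any


def equal_state_value_sketch(state1: Any, state2: Any) -> bool:
    """Recursively compare nested state values (sketch; Tensor branch omitted)."""
    # Exact type match: a list never equals a tuple, even with the same items.
    if type(state1) is not type(state2):
        return False

    # Mappings: same keys, then recurse into each value.
    if isinstance(state1, Mapping):
        return state1.keys() == state2.keys() and all(
            equal_state_value_sketch(state1[k], state2[k]) for k in state1
        )

    # Sequences other than str: same length, then recurse element-wise.
    if isinstance(state1, Sequence) and not isinstance(state1, str):
        return len(state1) == len(state2) and all(
            equal_state_value_sketch(s1, s2) for s1, s2 in zip(state1, state2)
        )

    # Primitives and strings: direct equality.
    return state1 == state2


a = [((1.0, 2.0), {"labels": [0, 1]})]
b = [((1.0, 2.0), {"labels": [0, 1]})]
c = [((1.0, 2.0), {"labels": [0, 2]})]

print(equal_state_value_sketch(a, b))  # True
print(equal_state_value_sketch(a, c))  # False
```

The string exclusion matters because a one-character string is a sequence whose only element is itself, so recursing into it would never terminate.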

The test case is well-constructed: DummyNestedListMetric faithfully reproduces the MeanAveragePrecision state shape (list[tuple[tensor, tensor]]), and the assertions verify both the compute-group merging and the state comparison.

Thematic areas to consider:

The helper function is internal (not exported in `__init__.py`) which is correct, but the docstring could benefit from mentioning what types it supports.
For future robustness, the MeanAveragePrecision reproduction in the linked issue would be a valuable addition (or a comment noting the fix covers it).
Consider whether dict states with tensor values are explicitly tested somewhere -- the recursive Mapping branch is new and worth confirming.
I've left a couple of minor observations below.

Borda · 2026-04-21T10:13:16Z

    return string[: -len(suffix)] if string.endswith(suffix) else string


+def _equal_state_value(state1: Any, state2: Any) -> bool:


The _equal_state_value docstring is one line. Consider listing supported types for callers: e.g., "Recursively compare metric state values. Supports: Tensor (shape+value), Mapping (key+recursive value), Sequence/str-excluded (length+recursive element), and primitives (direct ==)."

Borda · 2026-04-21T10:13:45Z


+def _equal_state_value(state1: Any, state2: Any) -> bool:
+    """Recursively compare metric state values while preserving structure checks."""
+    if type(state1) is not type(state2):


type(state1) is not type(state2) is intentionally strict (exact type match), which is the right call here -- but worth a comment since most Python code uses != or isinstance. Consider # noqa: E721 or a short note.
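To illustrate why the exact-type check behaves differently from `isinstance`, here is a small standalone sketch using only built-in types. The variable names are illustrative and not taken from the PR.

```python
list_state = [1, 2]
tuple_state = (1, 2)

# Exact type comparison: list and tuple are different types, so the states differ
# before any element is inspected.
types_differ = type(list_state) is not type(tuple_state)

# isinstance follows inheritance: bool is a subclass of int, so this is True ...
bool_is_int_instance = isinstance(True, int)

# ... whereas the exact check keeps bool and int apart.
bool_is_exactly_int = type(True) is type(1)

print(types_differ, bool_is_int_instance, bool_is_exactly_int)  # True True False
```

With the strict check, two states that happen to hold equal values in different container types are never treated as interchangeable.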

Borda · 2026-04-21T10:14:20Z

+    if isinstance(state1, Mapping):
+        return state1.keys() == state2.keys() and all(_equal_state_value(state1[k], state2[k]) for k in state1)
+
+    if isinstance(state1, Sequence) and not isinstance(state1, str):
+        return len(state1) == len(state2) and all(_equal_state_value(s1, s2) for s1, s2 in zip(state1, state2))


The Mapping and Sequence branches use generator expressions with all(). For very deep nesting (e.g., list of list of list of ...) this could hit Python recursion limits. In practice metric states are rarely >3 levels deep, so this is theoretical -- but note the limit if it matters for your use case.
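The recursion concern can be demonstrated with a reduced comparator. This is a sketch under assumptions: `equal_nested` and `nest` are hypothetical helpers that mirror only the Sequence branch quoted above, and the depth is chosen relative to the interpreter's current recursion limit rather than any realistic metric state.

```python
import sys
from collections.abc import Sequence
from typing import Any


def equal_nested(state1: Any, state2: Any) -> bool:
    """Reduced recursive comparison: exact type, then element-wise for sequences."""
    if type(state1) is not type(state2):
        return False
    if isinstance(state1, Sequence) and not isinstance(state1, str):
        return len(state1) == len(state2) and all(
            equal_nested(s1, s2) for s1, s2 in zip(state1, state2)
        )
    return state1 == state2


def nest(depth: int) -> list:
    """Build a list nested `depth` levels deep, iteratively (no recursion needed)."""
    value: list = []
    for _ in range(depth):
        value = [value]
    return value


# Shallow nesting, as in typical metric states, compares without trouble.
shallow_ok = equal_nested(nest(3), nest(3))

# Nesting deeper than the recursion limit makes the recursive comparison fail.
depth = sys.getrecursionlimit() * 2
try:
    equal_nested(nest(depth), nest(depth))
    hit_limit = False
except RecursionError:
    hit_limit = True

print(shallow_ok, hit_limit)
```

Each nesting level costs at least one Python frame, so only states nested on the order of the recursion limit are affected.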

omkar-334 added 2 commits March 18, 2026 01:48

add _equal_state_value function

b0a5e37

add test for nested metrics

31a497b

Signed-off-by: Omkar Kabde <omkarkabde@gmail.com>

omkar-334 requested review from SkafteNicki, justusschock and lantiga as code owners March 17, 2026 20:23

Borda added the bug / fix Something isn't working label Mar 18, 2026

Borda requested a review from Copilot March 18, 2026 15:09

Copilot started reviewing on behalf of Borda March 18, 2026 15:10 View session

Copilot AI reviewed Mar 18, 2026

View reviewed changes

omkar-334 added 2 commits March 19, 2026 13:42

Merge branch 'master' into fix-metric

a7620d8

Merge branch 'master' into fix-metric

f1e2149

Merge branch 'master' into fix-metric

1f374cd

Borda reviewed Apr 21, 2026

View reviewed changes

omkar-334 wants to merge 5 commits into Lightning-AI:master from omkar-334:fix-metric

4 participants
