Fix delegator state inconsistency on put/delete pairs #5331

Open
DracoLi wants to merge 4 commits into master from dl/diff-delegator-cancel-out

@DracoLi DracoLi commented May 1, 2026

Why this should be merged

Defensive fix for an internal consistency issue in baseStakers and diffStakers. The pending writes don't cancel out matching PutDelegator / DeleteDelegator pairs, so both addedDelegators and deletedDelegators can end up populated for the same TxID, inconsistent with the in-memory btree.

There's no way to trigger this normally. The bug requires the same delegator (same staker.TxID) to be both added and removed within a single diff, which no current code path does.

How this works

Cancel out matching put/delete pairs in two places:

  • baseStakers.PutDelegator / DeleteDelegator: when a call is the inverse of an earlier call in the pending writes, undo the prior call instead of recording the new one.
  • diffStakers.PutDelegator / DeleteDelegator: same shape applied to the in-flight diff. Without this, GetDelegatorIterator on a delete-then-readd diff would filter the re-added delegator out.
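The cancel-out rule described above can be sketched as follows. This is a minimal model under stated assumptions, not the code in stakers.go: `pendingDelegators`, `Put`, `Delete`, and the simplified `Staker` and `TxID` types are hypothetical stand-ins for the real pending-write structures.

```go
package main

import "fmt"

// TxID and Staker are simplified stand-ins for the real types (assumption).
type TxID string

type Staker struct {
	TxID   TxID
	Weight uint64
}

// pendingDelegators is a hypothetical model of the pending writes:
// two maps keyed by TxID, mirroring addedDelegators / deletedDelegators.
type pendingDelegators struct {
	added   map[TxID]*Staker
	deleted map[TxID]*Staker
}

func newPending() *pendingDelegators {
	return &pendingDelegators{
		added:   make(map[TxID]*Staker),
		deleted: make(map[TxID]*Staker),
	}
}

// Put records an addition, unless it is the inverse of an earlier delete
// for the same TxID, in which case the earlier delete is undone instead.
func (p *pendingDelegators) Put(s *Staker) {
	if _, ok := p.deleted[s.TxID]; ok {
		delete(p.deleted, s.TxID)
		return
	}
	p.added[s.TxID] = s
}

// Delete records a removal, unless it is the inverse of an earlier put
// for the same TxID, in which case the earlier put is undone instead.
func (p *pendingDelegators) Delete(s *Staker) {
	if _, ok := p.added[s.TxID]; ok {
		delete(p.added, s.TxID)
		return
	}
	p.deleted[s.TxID] = s
}

func main() {
	p := newPending()
	d := &Staker{TxID: "tx1", Weight: 2}

	// Delete followed by put of the same delegator leaves no pending writes.
	p.Delete(d)
	p.Put(d)
	fmt.Println(len(p.added), len(p.deleted)) // prints "0 0"
}
```

In this model the two maps can never both hold the same TxID, which is the invariant the fix is meant to preserve.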

How this was tested

  • TestDiffStakersDelegatorCancelOut — diff-level read correctness via GetDelegatorIterator / GetStakerIterator for both delete-then-put and put-then-delete. The delete-then-put subtest fails without the fix: deletedDelegators still contains the staker, so the iterator filters out the parent-supplied entry and returns nil.

  • TestBaseStakersDelegatorCancelOutPersistence — writes the diff through writeCurrentDelegatorDiff into a memdb-backed linkeddb and checks the persisted row for both orderings. The delete-then-put subtest fails without the fix: writeCurrentDelegatorDiff sees the staker in deletedDelegators and removes the DB row.
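The read-side failure mode the first test targets can be illustrated with a simplified filter. This is a sketch under assumptions: `visibleDelegators` is a hypothetical helper that models only the "skip anything marked deleted" step, whereas the real iterator merges tree-backed iterators.

```go
package main

import "fmt"

// Staker is a simplified stand-in for the real staker type (assumption).
type Staker struct {
	TxID   string
	Weight uint64
}

// visibleDelegators is a hypothetical stand-in for the diff-level read:
// it returns the parent's delegators minus any whose TxID is marked deleted.
func visibleDelegators(parent []*Staker, deleted map[string]*Staker) []*Staker {
	var out []*Staker
	for _, s := range parent {
		if _, gone := deleted[s.TxID]; gone {
			continue // filtered out by a pending delete
		}
		out = append(out, s)
	}
	return out
}

func main() {
	d := &Staker{TxID: "tx1", Weight: 2}
	parent := []*Staker{d}

	// Without cancel-out, a delete-then-put leaves a stale delete entry,
	// so the parent-supplied delegator is hidden from readers.
	stale := map[string]*Staker{d.TxID: d}
	fmt.Println(len(visibleDelegators(parent, stale))) // prints "0"

	// With cancel-out, the delete entry is gone and the delegator is visible.
	clean := map[string]*Staker{}
	fmt.Println(len(visibleDelegators(parent, clean))) // prints "1"
}
```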

Need to be documented in RELEASES.md?

No

@DracoLi DracoLi force-pushed the dl/diff-delegator-cancel-out branch 4 times, most recently from b511283 to e710f54 Compare May 1, 2026 19:02
@DracoLi DracoLi added the ai-review Trigger Claude Code Review CI on this PR label May 8, 2026
@DracoLi DracoLi force-pushed the dl/diff-delegator-cancel-out branch 3 times, most recently from e75533e to 30598bd Compare May 10, 2026 21:05
@DracoLi DracoLi marked this pull request as ready for review May 11, 2026 13:05
@DracoLi DracoLi requested a review from a team as a code owner May 11, 2026 13:05
Copilot AI review requested due to automatic review settings May 11, 2026 13:05
Copilot AI left a comment

Pull request overview

Defensive fix to keep delegator diffs internally consistent when the same delegator is both put and deleted within a single diff cycle, ensuring pending writes and iterators don’t disagree (and persistence doesn’t accidentally delete or re-add rows).

Changes:

  • Cancel out inverse PutDelegator/DeleteDelegator pairs in baseStakers pending writes.
  • Apply the same cancel-out behavior in diffStakers to keep GetDelegatorIterator/GetStakerIterator correct under delete-then-readd and put-then-delete sequences.
  • Add regression tests covering iterator correctness and persisted DB row behavior; fix iterator lifecycle management in existing diff tests.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| vms/platformvm/state/stakers.go | Adds cancel-out logic for delegator put/delete pairs in both base and diff staker trackers to prevent inconsistent added/deleted sets. |
| vms/platformvm/state/stakers_test.go | Adds tests for delegator cancel-out behavior at diff and persistence layers (but currently has iterator double-release issues). |
| vms/platformvm/state/diff_test.go | Ensures iterators are properly released in delegator iterator tests. |
| vms/platformvm/state/BUILD.bazel | Adds the linkeddb dependency needed by the new persistence test. |

Comment thread vms/platformvm/state/stakers_test.go Outdated
Comment thread vms/platformvm/state/stakers_test.go Outdated
Comment thread vms/platformvm/state/stakers_test.go Outdated
Comment thread vms/platformvm/state/stakers_test.go Outdated
@DracoLi DracoLi moved this to In Progress 🏗️ in avalanchego May 11, 2026
@DracoLi DracoLi force-pushed the dl/diff-delegator-cancel-out branch from 11bda51 to a0ba33e Compare May 19, 2026 14:11
Comment thread vms/platformvm/state/stakers_test.go Outdated
yacovm previously approved these changes May 19, 2026
@yacovm yacovm left a comment

I don't think this can be triggered in practice, because the same delegator can't be both added and removed (in either order) within a single block, so it should be fine to add this as a sanity check for future P-chain development.

@rrazvan1
If we want to support this, we need to cover the case where we delete and then put the same delegator with updated fields, for example a different weight.

This test is failing:

func TestStateDiffIntegration_PutThenDeleteSameDelegatorTxIDWithDifferentWeight(t *testing.T) {
	state := newTestState(t, memdb.New())

	diff, err := NewDiffOn(state, StakerAdditionAfterDeletionAllowed)
	require.NoError(t, err)

	delegator := newTestStaker(ids.GenerateTestID(), ids.GenerateTestNodeID())
	delegator.Weight = 2
	require.NoError(t, diff.PutCurrentDelegator(delegator))
	require.NoError(t, diff.Apply(state))

	diff, err = NewDiffOn(state, StakerAdditionAfterDeletionAllowed)
	require.NoError(t, err)

	lowerWeightDelegator := *delegator
	lowerWeightDelegator.Weight = delegator.Weight - 1

	require.NoError(t, diff.DeleteCurrentDelegator(delegator))
	require.NoError(t, diff.PutCurrentDelegator(&lowerWeightDelegator))
	require.NoError(t, diff.Apply(state))

	delegatorIterator, err := state.GetCurrentDelegatorIterator(delegator.SubnetID, delegator.NodeID)
	require.NoError(t, err)
	defer delegatorIterator.Release()

	require.Equal(t, []*Staker{&lowerWeightDelegator}, iterator.ToSlice(delegatorIterator))
}

@DracoLi DracoLi force-pushed the dl/diff-delegator-cancel-out branch from f883b2b to 80b421a Compare May 23, 2026 18:34
- if _, ok := validatorDiff.deletedDelegators[staker.TxID]; ok {
+ // Cancel a prior delete only when the re-added staker is identical.
+ // A mismatch (e.g. different weight) is recorded as a replacement.
+ if existing, ok := validatorDiff.deletedDelegators[staker.TxID]; ok && existing.Equals(staker) {
when I revert this change, the tests still pass 🫠
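For reference, the difference between cancelling on TxID alone and cancelling only when the re-added staker is identical can be sketched in a simplified model. Everything here is an assumption for illustration: `delegatorDiff`, `PutGuarded`, `PutUnguarded`, and this `Equals` are hypothetical and compare only two fields, unlike the real types.

```go
package main

import "fmt"

// Staker is a simplified stand-in for the real staker type (assumption).
type Staker struct {
	TxID   string
	Weight uint64
}

// Equals is a hypothetical field-by-field comparison over this toy type.
func (s *Staker) Equals(o *Staker) bool {
	return s.TxID == o.TxID && s.Weight == o.Weight
}

// delegatorDiff is a simplified model of one validator's pending changes.
type delegatorDiff struct {
	added   map[string]*Staker
	deleted map[string]*Staker
}

func newDelegatorDiff() *delegatorDiff {
	return &delegatorDiff{
		added:   make(map[string]*Staker),
		deleted: make(map[string]*Staker),
	}
}

// Delete records a pending removal.
func (d *delegatorDiff) Delete(s *Staker) { d.deleted[s.TxID] = s }

// PutGuarded cancels a prior delete only when the re-added staker is
// identical; otherwise it keeps the delete and records the add (a replacement).
func (d *delegatorDiff) PutGuarded(s *Staker) {
	if existing, ok := d.deleted[s.TxID]; ok && existing.Equals(s) {
		delete(d.deleted, s.TxID)
		return
	}
	d.added[s.TxID] = s
}

// PutUnguarded cancels on TxID alone, so an updated staker is dropped.
func (d *delegatorDiff) PutUnguarded(s *Staker) {
	if _, ok := d.deleted[s.TxID]; ok {
		delete(d.deleted, s.TxID)
		return
	}
	d.added[s.TxID] = s
}

func main() {
	old := &Staker{TxID: "tx1", Weight: 2}
	updated := &Staker{TxID: "tx1", Weight: 1}

	g := newDelegatorDiff()
	g.Delete(old)
	g.PutGuarded(updated)
	fmt.Println(len(g.added), len(g.deleted)) // prints "1 1"

	u := newDelegatorDiff()
	u.Delete(old)
	u.PutUnguarded(updated)
	fmt.Println(len(u.added), len(u.deleted)) // prints "0 0"
}
```

In this model, only a delete-then-put with a changed field distinguishes the two variants; a delete-then-put of an identical staker produces the same result either way.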
