Skip to content

cloud_io: release client lease before retry backoff sleep#30177

Merged
nvartolomei merged 2 commits intoredpanda-data:devfrom
nvartolomei:nv/cloud-io-release-lease-before-retry-sleep
Apr 17, 2026
Merged

cloud_io: release client lease before retry backoff sleep#30177
nvartolomei merged 2 commits intoredpanda-data:devfrom
nvartolomei:nv/cloud-io-release-lease-before-retry-sleep

Conversation

@nvartolomei
Copy link
Copy Markdown
Contributor

Most methods in cloud_io::remote acquired a client lease from the pool
before the retry loop and held it for the entire duration — including
across retry backoff sleeps. This starves the client pool: other
operations on the same shard block waiting for a client while the
retrying operation is just sleeping.

Move lease acquisition inside the retry loop for the six methods that had
it outside (download_stream, download_object, object_exists,
delete_object, delete_object_batch, list_objects), matching the
pattern already used by upload_object. Additionally, explicitly release
the lease after shutdown() and before sleep_abortable() in all eight
methods so the client is returned to the pool during backoff.

As a side effect, the retry permit check (which tests the caller's abort
source) now happens before attempting to acquire a client from the pool.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v26.1.x
  • v25.3.x
  • v25.2.x

Release Notes

  • none

Acquire the client lease inside the retry loop and explicitly release it
after shutdown, before sleeping for backoff. Previously most methods
acquired the lease once before the loop and held it across all retries,
starving the pool while sleeping.

As a side effect, the retry permit check (which tests the caller's abort
source) now happens before attempting to acquire a client from the pool.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates cloud_io::remote retry loops to avoid starving the cloud_storage_clients::client_pool by not holding a client lease across backoff sleeps, improving concurrency/fairness for other operations.

Changes:

  • Move client lease acquisition inside retry loops for several remote operations (download/head/delete/list) to avoid holding leases while sleeping.
  • Explicitly drop the lease after shutdown() before executing retry backoff sleeps.
  • Remove now-unneeded ssx:semaphore/ssx:watchdog build dependencies from the cloud_io:remote Bazel target.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

File Description
src/v/cloud_io/remote.cc Refactors retry loops to acquire/release client leases per-attempt and release before backoff sleep.
src/v/cloud_io/BUILD Drops unused implementation deps from the remote library target.

Comment thread src/v/cloud_io/remote.cc
Comment thread src/v/cloud_io/remote.cc
Comment thread src/v/cloud_io/remote.cc
Comment thread src/v/cloud_io/remote.cc
Comment thread src/v/cloud_io/remote.cc
Comment thread src/v/cloud_io/remote.cc
Comment thread src/v/cloud_io/remote.cc
Comment thread src/v/cloud_io/remote.cc
@vbotbuildovich
Copy link
Copy Markdown
Collaborator

CI test results

test results on build#83192
test_status test_class test_method test_arguments test_kind job_url passed reason test_history
FLAKY(PASS) ShadowLinkingReplicationTests test_with_restart null integration https://buildkite.com/redpanda/redpanda/builds/83192#019d92a2-4ee5-4af7-8012-181c83a7a9db 10/11 Test PASSES after retries.No significant increase in flaky rate(baseline=0.0000, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ShadowLinkingReplicationTests&test_method=test_with_restart
FLAKY(PASS) RedpandaNodeOperationsSmokeTest test_node_ops_smoke_test {"cloud_storage_type": 1, "mixed_versions": false} integration https://buildkite.com/redpanda/redpanda/builds/83192#019d92a1-3599-46af-8d0d-4954a87bdbab 10/11 Test PASSES after retries.No significant increase in flaky rate(baseline=0.0049, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=RedpandaNodeOperationsSmokeTest&test_method=test_node_ops_smoke_test

Comment thread src/v/cloud_io/remote.cc

lease.client->shutdown();
{
auto _ = std::move(lease);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::ignore = std::move(lease);?

@nvartolomei nvartolomei merged commit 757ab24 into redpanda-data:dev Apr 17, 2026
24 checks passed
@vbotbuildovich
Copy link
Copy Markdown
Collaborator

/backport v26.1.x

Comment thread src/v/cloud_io/remote.cc

lease.client->shutdown();
{
auto _ = std::move(lease);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use std:ignore =

@nvartolomei
Copy link
Copy Markdown
Contributor Author

nvartolomei commented Apr 21, 2026 via email

@dotnwat
Copy link
Copy Markdown
Member

dotnwat commented Apr 22, 2026

Moving to std::ignore is noop and doesn’t achieve the goal.

I don't think so. Pretty sure a valid use case for std::ignore = is as a replacement for casting to void, or assigning to _ or unused or any other throwaway name, all of which still evaluate the expression, which IIUC is all you're trying to achieve. Is there something else special about std::ignore i'm missing?

@nvartolomei
Copy link
Copy Markdown
Contributor Author

nvartolomei commented Apr 22, 2026 via email

@dotnwat
Copy link
Copy Markdown
Member

dotnwat commented Apr 22, 2026

https://godbolt.org/z/xdxzecoYG

On Wed, 22 Apr 2026 at 21:54, Noah Watkins @.> wrote: dotnwat left a comment (redpanda-data/redpanda#30177) <#30177?email_source=notifications&email_token=AAEETWPGDXFAFCKOIX5FYLD4XEWO3A5CNFSNUABFM5UWIORPF5TWS5BNNB2WEL2JONZXKZKDN5WW2ZLOOQXTIMRZHE4TAMZYHEYKM4TFMFZW63VMON2GC5DFL5RWQYLOM5S2KZLWMVXHJNLQOJPWG33NNVSW45C7N5YGK3S7MNWGSY3L#issuecomment-4299903890> Moving to std::ignore is noop and doesn’t achieve the goal. I don't think so. Pretty sure a valid use case for std::ignore = is as a replacement for casting to void, or assigning to _ or unused or any other throwaway name, all of which don't evaluation of the expression, which IIUC is all you're trying to achieve. Is there something else special about std::ignore i'm missing? — Reply to this email directly, view it on GitHub <#30177?email_source=notifications&email_token=AAEETWPGDXFAFCKOIX5FYLD4XEWO3A5CNFSNUABFM5UWIORPF5TWS5BNNB2WEL2JONZXKZKDN5WW2ZLOOQXTIMRZHE4TAMZYHEYKM4TFMFZW63VMON2GC5DFL5RWQYLOM5S2KZLWMVXHJNLQOJPWG33NNVSW45C7N5YGK3S7MNWGSY3L#issuecomment-4299903890>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEETWM3KT5QDW32GYHGDTL4XEWO3AVCNFSM6AAAAACX2T7PLKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DEOJZHEYDGOBZGA . You are receiving this because you modified the open/close state.Message ID: @.>

Yeh, exactly you need to make sure that the statement does something. std::move is just a cast, so you need something like std::exchange.

@nvartolomei
Copy link
Copy Markdown
Contributor Author

nvartolomei commented Apr 23, 2026 via email

@dotnwat
Copy link
Copy Markdown
Member

dotnwat commented Apr 23, 2026

Not following anymore. What’s your proposal?

On Thu, 23 Apr 2026 at 00:20, Noah Watkins @.> wrote: dotnwat left a comment (redpanda-data/redpanda#30177) <#30177?email_source=notifications&email_token=AAEETWNTIMOWD4I2YGQFPHL4XFHUFA5CNFSNUABFM5UWIORPF5TWS5BNNB2WEL2JONZXKZKDN5WW2ZLOOQXTIMZQGA2TOMBSG44KM4TFMFZW63VMON2GC5DFL5RWQYLOM5S2KZLWMVXHJNLQOJPWG33NNVSW45C7N5YGK3S7MNWGSY3L#issuecomment-4300570278> https://godbolt.org/z/xdxzecoYG … <#m_-5804177704652785510_> On Wed, 22 Apr 2026 at 21:54, Noah Watkins @.> wrote: dotnwat left a comment (redpanda-data/redpanda#30177 <#30177>) <#30177 <#30177>?email_source=notifications&email_token=AAEETWPGDXFAFCKOIX5FYLD4XEWO3A5CNFSNUABFM5UWIORPF5TWS5BNNB2WEL2JONZXKZKDN5WW2ZLOOQXTIMRZHE4TAMZYHEYKM4TFMFZW63VMON2GC5DFL5RWQYLOM5S2KZLWMVXHJNLQOJPWG33NNVSW45C7N5YGK3S7MNWGSY3L#issuecomment-4299903890> Moving to std::ignore is noop and doesn’t achieve the goal. I don't think so. Pretty sure a valid use case for std::ignore = is as a replacement for casting to void, or assigning to _ or unused or any other throwaway name, all of which don't evaluation of the expression, which IIUC is all you're trying to achieve. Is there something else special about std::ignore i'm missing? — Reply to this email directly, view it on GitHub <#30177 <#30177>?email_source=notifications&email_token=AAEETWPGDXFAFCKOIX5FYLD4XEWO3A5CNFSNUABFM5UWIORPF5TWS5BNNB2WEL2JONZXKZKDN5WW2ZLOOQXTIMRZHE4TAMZYHEYKM4TFMFZW63VMON2GC5DFL5RWQYLOM5S2KZLWMVXHJNLQOJPWG33NNVSW45C7N5YGK3S7MNWGSY3L#issuecomment-4299903890>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEETWM3KT5QDW32GYHGDTL4XEWO3AVCNFSM6AAAAACX2T7PLKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DEOJZHEYDGOBZGA https://github.com/notifications/unsubscribe-auth/AAEETWM3KT5QDW32GYHGDTL4XEWO3AVCNFSM6AAAAACX2T7PLKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DEOJZHEYDGOBZGA . You are receiving this because you modified the open/close state.Message ID: @.> Yeh, exactly you need to make sure that the statement does something. std::move is just a cast, so you need something like std::exchange. — Reply to this email directly, view it on GitHub <#30177?email_source=notifications&email_token=AAEETWNTIMOWD4I2YGQFPHL4XFHUFA5CNFSNUABFM5UWIORPF5TWS5BNNB2WEL2JONZXKZKDN5WW2ZLOOQXTIMZQGA2TOMBSG44KM4TFMFZW63VMON2GC5DFL5RWQYLOM5S2KZLWMVXHJNLQOJPWG33NNVSW45C7N5YGK3S7MNWGSY3L#issuecomment-4300570278>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEETWN4RDX6X5KCJ3LRTRD4XFHUFAVCNFSM6AAAAACX2T7PLKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DGMBQGU3TAMRXHA . You are receiving this because you modified the open/close state.Message ID: @.**>

Single underscore name as placeholder pretty sure isn't standard until c++26?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants