
[ENH](cli): add retry logic with exponential backoff to copy command #7102

Open
rescrv wants to merge 1 commit into main from rescrv/copy-robustly


@rescrv rescrv commented May 20, 2026

Description of changes

Wrap all network calls in the copy command with exponential backoff
retries using the backon crate. This makes the copy operation robust
against transient network errors, rate limiting, and server errors.
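As a rough illustration of the retry pattern described above, here is a minimal standard-library sketch. It does not use the backon crate and is not the code in this PR; `retry_with_backoff` and its parameters are hypothetical names chosen for the example. It doubles the delay after each retryable failure, caps it at `max_delay`, and returns non-retryable errors immediately.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: run `op` up to `max_attempts` times, sleeping with an
/// exponentially growing delay between retryable failures.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    max_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
    is_retryable: impl Fn(&E) -> bool,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Retry only while attempts remain and the error is transient.
            Err(err) if attempt < max_attempts && is_retryable(&err) => {
                sleep(delay);
                // Double the delay, but never exceed the configured cap.
                delay = (delay * 2).min(max_delay);
                attempt += 1;
            }
            // Out of attempts, or a deterministic error: give up.
            Err(err) => return Err(err),
        }
    }
}
```

A real implementation would likely add jitter and run asynchronously; this sketch blocks the calling thread to stay dependency-free.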

Key changes:

  • Add backon dependency for retry support
  • Classify errors as retryable (timeouts, 429, 5xx, network) vs
    deterministic (400, 404, validation)
  • Wrap list, get, create, search, add, and count calls in retry logic
  • Use get_or_create_collection for idempotent collection creation
  • Verify target collection is empty after get_or_create to prevent
    partial-copy corruption
  • Add unit tests for retryable error classification
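The classification and the empty-target check listed above might look roughly like the following. This is a sketch under stated assumptions, not the PR's code: `CopyError`, `is_retryable`, and `ensure_target_empty` are hypothetical names, and the real error types in the CLI may be shaped differently.

```rust
/// Hypothetical error type for illustration only.
#[derive(Debug)]
enum CopyError {
    Timeout,
    Network(String),
    Http(u16),
    Validation(String),
}

/// Transient failures (timeouts, network errors, 429, 5xx) are retryable;
/// deterministic ones (400, 404, validation) are not.
fn is_retryable(err: &CopyError) -> bool {
    match err {
        CopyError::Timeout | CopyError::Network(_) => true,
        CopyError::Http(status) => *status == 429 || (500..=599).contains(status),
        CopyError::Validation(_) => false,
    }
}

/// Assumed guard: get-or-create may hand back a collection that already
/// exists, so refuse to copy into a target that already holds records.
fn ensure_target_empty(name: &str, record_count: u64) -> Result<(), String> {
    if record_count == 0 {
        Ok(())
    } else {
        Err(format!(
            "target collection '{name}' already contains {record_count} records"
        ))
    }
}
```

The empty check matters because an idempotent get-or-create cannot distinguish a freshly created collection from one left behind by an earlier, partially completed copy.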

Test plan

CI, plus I will run a local copy during review.

Migration plan

N/A

Observability plan

N/A

Documentation Changes

N/A

Co-authored-by: AI

@github-actions

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use cases in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (readability, modularity, intuitiveness)?

@rescrv rescrv requested a review from itaismith May 20, 2026 21:45