DAOS-18891 object: retry if vos_update_end return -DER_AGAIN - b28#18265

Open
Nasf-Fan wants to merge 1 commit into release/2.8 from Nasf-Fan/DAOS-18891_b28

Conversation

@Nasf-Fan
Contributor

On the server side, an update operation may yield the CPU between the related vos_update_begin() and vos_update_end() calls. During that interval, the object held via vos_update_begin() may be evicted by others, for example by another failed modification against the same object shard, or by eviction under md-on-ssd mode. vos_update_end() checks for this case and returns -DER_AGAIN instead of -DER_TX_RESTART to notify the caller. The caller then needs to retry the update instead of failing out.
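
The retry described above can be illustrated with a minimal, self-contained sketch. It is not the code from this patch: `sketch_update_begin()`, `sketch_update_end()`, `SKETCH_DER_AGAIN`, `SKETCH_MAX_RETRY` and `evictions_left` are hypothetical stand-ins, and the retry cap is an assumption, since the description does not say whether retries are bounded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in value; the real error codes come from the DAOS headers. */
#define SKETCH_DER_AGAIN (-1)
/* Assumed retry cap, not taken from the patch. */
#define SKETCH_MAX_RETRY 8

/* Hypothetical state: how many more times the held object is evicted during the yield. */
static int evictions_left;

/* Stand-in for vos_update_begin(): takes a hold on the object. */
static int sketch_update_begin(void)
{
	return 0;
}

/* Stand-in for vos_update_end(): reports "again" if the object was evicted meanwhile. */
static int sketch_update_end(void)
{
	if (evictions_left > 0) {
		evictions_left--;
		return SKETCH_DER_AGAIN;
	}
	return 0;
}

/* Redo the whole begin/end pair while the end step reports a transient eviction. */
static int sketch_update_with_retry(int *attempts)
{
	int rc;

	assert(attempts != NULL);
	*attempts = 0;
	do {
		(*attempts)++;
		rc = sketch_update_begin();
		if (rc != 0)
			break;
		/* A CPU yield could happen here, before the end step runs. */
		rc = sketch_update_end();
	} while (rc == SKETCH_DER_AGAIN && *attempts < SKETCH_MAX_RETRY);

	return rc;
}
```

In this sketch, setting `evictions_left` to 2 makes the call succeed on the third attempt. Any other error from the end step is returned to the caller unchanged.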

The patch also initializes some local variables in the object module to avoid random corruption when handling some failure cases.
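
As a generic illustration of why uninitialized locals matter on failure paths (the description does not name the variables involved, so `sketch_dup()` below is purely hypothetical), consider a function with a shared cleanup label:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies src into a fresh buffer, or fails early. */
static int sketch_dup(const char *src, char **out)
{
	char *buf = NULL; /* initialized, so the cleanup path can free it safely */
	int   rc  = 0;

	assert(out != NULL);
	if (src == NULL) {
		rc = -1; /* early failure: buf was never allocated */
		goto cleanup;
	}

	buf = malloc(strlen(src) + 1);
	if (buf == NULL) {
		rc = -1;
		goto cleanup;
	}
	strcpy(buf, src);
	*out = buf;
	buf  = NULL; /* ownership moved to the caller */
cleanup:
	free(buf); /* free(NULL) is a no-op */
	return rc;
}
```

Without the `= NULL` initializer, the early-failure branch would pass an indeterminate pointer to `free()`, which is the kind of random corruption an uninitialized local can cause.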

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

@github-actions

Ticket title is 'osa/online_extend.py:OSAOnlineExtend.test_osa_online_extend_drain_after_rebuild - DER_TX_RESTART(-2025)'
Status is 'In Review'
Labels: 'ci_master_weekly,weekly_test'
Job should run at elevated priority (1)
https://daosio.atlassian.net/browse/DAOS-18891

@github-actions github-actions Bot added the priority Ticket has high priority (automatically managed) label May 16, 2026
@Nasf-Fan Nasf-Fan marked this pull request as ready for review May 18, 2026 14:17
@Nasf-Fan Nasf-Fan requested review from a team as code owners May 18, 2026 14:17
@Nasf-Fan Nasf-Fan force-pushed the Nasf-Fan/DAOS-18891_b28 branch 4 times, most recently from 6c410cc to 99d47fd Compare May 21, 2026 02:47
On the server side, an update operation may yield the CPU between the
related vos_update_begin() and vos_update_end() calls. During that
interval, the object held via vos_update_begin() may be evicted by
others, for example by another failed modification against the same
object shard, or by eviction under md-on-ssd mode. vos_update_end()
checks for this case and returns -DER_AGAIN instead of -DER_TX_RESTART
to notify the caller. The caller then needs to retry the update instead
of failing out.

The patch also initializes some local variables in the object module
to avoid random corruption when handling some failure cases.

Signed-off-by: Fan Yong <fan.yong@hpe.com>
@Nasf-Fan Nasf-Fan force-pushed the Nasf-Fan/DAOS-18891_b28 branch from 99d47fd to 9ae3b4f Compare May 21, 2026 08:14