cl/test: rnot: add cloud topic workloads #30435
Open
Lazin wants to merge 3 commits into
Contributor
Pull request overview
Extends the shadow-linking random node operations test to also exercise cloud-topics and tiered-cloud-topics storage modes during node operations, including enabling the necessary cluster config/feature flags and allow-listing expected cloud-topics shutdown/retry log messages.
Changes:
- Enable `cloud_topics_enabled` on both clusters and activate the explicit-only `tiered_cloud_topics` feature before topic creation.
- Increase preallocated client nodes and ducktape cluster node count to support two additional concurrent workloads.
- Add two new workloads (`cloud-topic`, `tiered-cloud-topic`) using `redpanda.storage.mode` topic config and allow-list expected cloud-topics shadow-link logs.
Comment on lines 264 to 267
    extra_rp_conf={
        "group_new_member_join_timeout": 3000,
        CLOUD_TOPICS_CONFIG_STR: True,
    },
Comment on lines 269 to 284
    @@ -276,6 +280,7 @@ def __init__(self, test_ctx: TestContext):
        "retention_local_trim_interval": 5000,
        "partition_autobalancing_tick_interval_ms": 2000,
        "group_new_member_join_timeout": 3000,
        CLOUD_TOPICS_CONFIG_STR: True,
    },
Comment on lines +329 to +333
    # enabled on both clusters before any tiered_cloud topic can be
    # created.
    self.source_cluster.service.set_feature_active(
        "tiered_cloud_topics", True, timeout_sec=30
    )
Collaborator
CI test results
- test results on build#84286
- test results on build#84317
- test results on build#84334
745e0db to c0ffaf5
Adds cloud-topic and tiered-cloud-topic workloads to the shadow linking random node ops test so we exercise plain cloud and tiered_cloud storage modes alongside the existing si, compacted, and transactional workloads. Enables the explicit-only tiered_cloud_topics feature on both clusters and CLOUD_TOPICS_CONFIG_STR cluster-wide; allow-lists the expected cloud-topics shutdown warnings.

Signed-off-by: Evgeny Lazin <[email protected]>
For storage.mode=cloud topics, replicate_at_offset previously sent every input batch through stage_write/execute_write and wrapped each one as a ctp_placeholder. The placeholder encoding drops the record key, so for control records (e.g. transaction commit/abort markers) the original key bytes are lost and rm_stm's parse_control_batch throws std::out_of_range on the empty iobuf, halting state machine apply at the marker offset.

Split the input list into user data batches (raft_data with !is_control()) and pass-through batches (raft_configuration, tx_fence, control batches, etc.). Only data batches are uploaded to L0 and wrapped as ctp_placeholders; the rest are forwarded to the write_at_offset_stm unchanged. The original input ordering is preserved by interleaving the generated placeholders with the pass-through batches.

Signed-off-by: Evgeny Lazin <[email protected]>
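The split-and-interleave step described in this commit message can be modelled in a few lines. This is only an illustrative Python sketch, not the C++ code from the PR: `Batch`, `is_user_data`, `replicate_preserving_order`, and the `upload_to_l0` callback are hypothetical names, and the sketch assumes the upload step returns exactly one placeholder per data batch, in input order.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Batch:
    # Hypothetical stand-in for a record batch; only the fields the split needs.
    batch_type: str  # e.g. "raft_data", "raft_configuration", "tx_fence"
    is_control: bool = False
    payload: bytes = b""


def is_user_data(b: Batch) -> bool:
    # Only non-control raft_data batches are eligible for placeholder wrapping.
    return b.batch_type == "raft_data" and not b.is_control


def replicate_preserving_order(
    batches: List[Batch],
    upload_to_l0: Callable[[List[Batch]], List[Batch]],
) -> List[Batch]:
    # Step 1: pick out the user data batches, keeping their relative order.
    data = [b for b in batches if is_user_data(b)]
    # Step 2: upload only those (assumed: one placeholder per data batch, in order).
    placeholders = upload_to_l0(data) if data else []
    if len(placeholders) != len(data):
        raise ValueError("expected one placeholder per data batch")
    # Step 3: walk the original input and interleave placeholders with the
    # pass-through batches, which are forwarded unchanged.
    it = iter(placeholders)
    return [next(it) if is_user_data(b) else b for b in batches]
```

In this model a transaction marker (a `raft_data` batch with `is_control=True`) keeps its original bytes because it never goes through the upload callback.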
Adds a new "flipping" workload_set matrix variant. A single workload runs against flipping-storage-topic while a background daemon thread toggles redpanda.storage.mode between cloud and tiered_cloud every 3 seconds on the source. Transient alter-config failures (leader changes, partition movement) are logged and retried on the next tick; the target config is not separately verified.

Wired through ClusterLinkingWorkloadSpec via optional flip_storage_modes / flip_interval_seconds fields so other workloads can opt in if needed.

Signed-off-by: Evgeny Lazin <[email protected]>
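A rough sketch of how such a flipper thread could look, for readers unfamiliar with the pattern. Everything here is an assumption made for illustration: `WorkloadSpec` is a trimmed-down stand-in for `ClusterLinkingWorkloadSpec`, and `start_flipper` and the `alter_config` callback are hypothetical names, not the helpers used in the test.

```python
import itertools
import threading
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class WorkloadSpec:
    # Hypothetical stand-in; only the opt-in flip fields are modelled.
    topic: str
    flip_storage_modes: Optional[Sequence[str]] = None
    flip_interval_seconds: float = 3.0


def start_flipper(
    spec: WorkloadSpec,
    alter_config: Callable[[str, str, str], None],
    stop: threading.Event,
    log: Callable[[str], None] = print,
) -> Optional[threading.Thread]:
    # Workloads that did not opt in get no background thread.
    if not spec.flip_storage_modes:
        return None

    def run() -> None:
        for mode in itertools.cycle(spec.flip_storage_modes):
            # Event.wait returns True once stop is set, False on timeout.
            if stop.wait(spec.flip_interval_seconds):
                return
            try:
                alter_config(spec.topic, "redpanda.storage.mode", mode)
            except Exception as e:
                # Transient failures are only logged; the next tick tries again.
                log(f"alter-config to {mode} failed, retrying next tick: {e}")

    t = threading.Thread(target=run, daemon=True)
    t.start()
    return t
```

Using `Event.wait` as the sleep lets the test stop the thread promptly during teardown instead of waiting out a full interval.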
6c0c15f to b0d9665
pgellert reviewed May 13, 2026
pgellert (Contributor) left a comment:
Looks good to me, but I'll let someone from the cloud topics team approve
Comment on lines +1283 to +1285
    for (auto&& b : passthrough_batches) {
        final_batches.push_back(std::move(b));
    }
Contributor
I think this would be simpler here:
Suggested change
    - for (auto&& b : passthrough_batches) {
    -     final_batches.push_back(std::move(b));
    - }
    + final_batches = std::move(passthrough_batches);
Adds cloud-topic and tiered-cloud-topic workloads to the shadow linking random node ops test so we exercise plain cloud and tiered_cloud storage modes alongside the existing si, compacted, and transactional workloads. Enables the explicit-only tiered_cloud_topics feature on both clusters and CLOUD_TOPICS_CONFIG_STR cluster-wide; allow-lists the expected cloud-topics shutdown warnings.
Also fixes a bug in the write-at-offset code path of the cloud topics frontend: the frontend converted batches of every type into placeholders, which stalled the target cluster. The second commit in the PR contains the fix.
Finally, the test adds a new workload that constantly flips between `cloud` and `tiered_cloud` modes. The goal is to have a mix of `raft_data` and `ct_placeholder` batches in the partition that is being shadowed.
Backports Required
Release Notes