server,mcs: pass split reason to scheduling service#10652
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds an ID guard in region validation, forwards split Reason to the scheduler, enforces affinity constraints (denying splits via heartbeat ChangeSplit), records LOAD split-scatter batches, refactors scheduling→PD header/error mapping with tests, and updates kvproto pinned versions across modules. ChangesAffinity-Constrained Splits
Sequence Diagram(s)sequenceDiagram
participant Client
participant PDServer as PD Server
participant Scheduler as SchedulingService
participant AffMgr as AffinityManager
participant HBStream as HeartbeatStream
Client->>PDServer: AskBatchSplit (includes Reason)
PDServer->>Scheduler: Forward AskBatchSplitRequest (sets Reason)
Scheduler->>AffMgr: CheckSplitAllowed(region)
AffMgr-->>Scheduler: Deny / Allow
alt Deny
Scheduler->>HBStream: Send ChangeSplit (AutoSplitEnabled=false)
Scheduler-->>PDServer: Return UNKNOWN error (no split IDs)
PDServer-->>Client: Return UNKNOWN error
else Allowed and Reason==LOAD
Scheduler->>Scheduler: RecordSplitScatterBatch(originalRegionID, epoch+1, newRegionIDs)
Scheduler-->>PDServer: Return split IDs
PDServer-->>Client: Return split IDs
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/mcs/scheduling/server/grpc_service_test.go`:
- Around line 112-116: The test currently checks
heartbeatResp.GetChangeSplit().GetAutoSplitEnabled() without first asserting
ChangeSplit exists; update the assertion in grpc_service_test.go to first verify
heartbeatResp.GetChangeSplit() is non-nil (or use heartbeatResp.ChangeSplit !=
nil via the generated struct) and then assert GetAutoSplitEnabled() is false;
locate the usage in the test around RegionHeartbeatResponse handling (symbols:
RegionHeartbeatResponse, GetChangeSplit, GetAutoSplitEnabled) and add the nil
check/assert immediately before the AutoSplitEnabled assertion.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 96496609-3e7d-4dc0-904d-6287a904cbb9
📥 Commits
Reviewing files that changed from the base of the PR and between f4813f3 and 509456f4d02eae16090973124aeb3435e060ff5e.
⛔ Files ignored due to path filters (4)
client/go.sumis excluded by!**/*.sumgo.sumis excluded by!**/*.sumtests/integrations/go.sumis excluded by!**/*.sumtools/go.sumis excluded by!**/*.sum
📒 Files selected for processing (7)
client/go.modgo.modpkg/mcs/scheduling/server/grpc_service.gopkg/mcs/scheduling/server/grpc_service_test.goserver/grpc_service.gotests/integrations/go.modtools/go.mod
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/mcs/scheduling/server/grpc_service_test.go (1)
67-73: 💤 Low valueConsider adding a comment explaining the drain loop.
The inner loop that drains the channel is correct but subtle. A brief comment would clarify that it's clearing stale messages to ensure a clean state for subsequent test assertions.
📝 Suggested comment
select { case <-stream.ch: + // Drain any additional stale messages to ensure clean state for { select { case <-stream.ch: default: return } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/mcs/scheduling/server/grpc_service_test.go` around lines 67 - 73, Add a short inline comment above the drain loop that iterates over stream.ch to explain its purpose: note that the for-select loop is intentionally non-blocking (using default) to clear any pending/stale messages from the channel so subsequent test assertions start from a clean state; reference the channel variable stream.ch and the non-blocking select pattern so future readers understand this is a deliberate drain, not an accidental busy-wait or infinite loop.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pkg/mcs/scheduling/server/grpc_service_test.go`:
- Around line 67-73: Add a short inline comment above the drain loop that
iterates over stream.ch to explain its purpose: note that the for-select loop is
intentionally non-blocking (using default) to clear any pending/stale messages
from the channel so subsequent test assertions start from a clean state;
reference the channel variable stream.ch and the non-blocking select pattern so
future readers understand this is a deliberate drain, not an accidental
busy-wait or infinite loop.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 69718e30-1e32-46c1-81da-f1dbcb797474
📥 Commits
Reviewing files that changed from the base of the PR and between 509456f4d02eae16090973124aeb3435e060ff5e and 398a84d0004fd40835dbfd6d8d5ea2da62c33232.
⛔ Files ignored due to path filters (4)
client/go.sumis excluded by!**/*.sumgo.sumis excluded by!**/*.sumtests/integrations/go.sumis excluded by!**/*.sumtools/go.sumis excluded by!**/*.sum
📒 Files selected for processing (1)
pkg/mcs/scheduling/server/grpc_service_test.go
There was a problem hiding this comment.
🧹 Nitpick comments (1)
server/grpc_service_test.go (1)
33-62: ⚡ Quick winAdd a test case for
ALREADY_BOOTSTRAPPEDmapping.
convertHeadernow mapsschedulingpb.ErrorType_ALREADY_BOOTSTRAPPED, but this branch is not covered in the table yet.Proposed test case addition
{ name: "not bootstrapped", in: &schedulingpb.Error{Type: schedulingpb.ErrorType_NOT_BOOTSTRAPPED, Message: "cluster is not initialized"}, want: &pdpb.Error{Type: pdpb.ErrorType_NOT_BOOTSTRAPPED, Message: "cluster is not initialized"}, }, + { + name: "already bootstrapped", + in: &schedulingpb.Error{Type: schedulingpb.ErrorType_ALREADY_BOOTSTRAPPED, Message: "cluster is already bootstrapped"}, + want: &pdpb.Error{Type: pdpb.ErrorType_ALREADY_BOOTSTRAPPED, Message: "cluster is already bootstrapped"}, + },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/grpc_service_test.go` around lines 33 - 62, Add a test case to the TestConvertSchedulingHeaderPreservesError test table to cover schedulingpb.ErrorType_ALREADY_BOOTSTRAPPED: update the testCases slice inside TestConvertSchedulingHeaderPreservesError to include an entry with name like "already bootstrapped", in set in to &schedulingpb.Error{Type: schedulingpb.ErrorType_ALREADY_BOOTSTRAPPED, Message: "already bootstrapped"} and want set to &pdpb.Error{Type: pdpb.ErrorType_ALREADY_BOOTSTRAPPED, Message: "already bootstrapped"} so the convertHeader mapping for ALREADY_BOOTSTRAPPED is exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@server/grpc_service_test.go`:
- Around line 33-62: Add a test case to the
TestConvertSchedulingHeaderPreservesError test table to cover
schedulingpb.ErrorType_ALREADY_BOOTSTRAPPED: update the testCases slice inside
TestConvertSchedulingHeaderPreservesError to include an entry with name like
"already bootstrapped", in set in to &schedulingpb.Error{Type:
schedulingpb.ErrorType_ALREADY_BOOTSTRAPPED, Message: "already bootstrapped"}
and want set to &pdpb.Error{Type: pdpb.ErrorType_ALREADY_BOOTSTRAPPED, Message:
"already bootstrapped"} so the convertHeader mapping for ALREADY_BOOTSTRAPPED is
exercised.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: eb429d38-d9fc-4483-aed3-6d8149a29872
📥 Commits
Reviewing files that changed from the base of the PR and between 398a84d0004fd40835dbfd6d8d5ea2da62c33232 and 9360f1956a8c04df1fc7d6b46e29a0c60fdcaf7e.
📒 Files selected for processing (5)
pkg/core/region.gopkg/mcs/scheduling/server/grpc_service_test.goserver/grpc_service.goserver/grpc_service_test.gotests/server/cluster/cluster_work_test.go
Signed-off-by: lhy1024 <[email protected]>
37f4f72 to
baea05b
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #10652 +/- ##
==========================================
- Coverage 79.07% 79.06% -0.01%
==========================================
Files 535 535
Lines 72708 72959 +251
==========================================
+ Hits 57492 57687 +195
- Misses 11162 11194 +32
- Partials 4054 4078 +24
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
Signed-off-by: lhy1024 <[email protected]>
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bufferflies, HunDunDM The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
|
/retest |
Signed-off-by: lhy1024 <[email protected]>
60d426a to
02d8521
Compare
|
/retest |
3 similar comments
|
/retest |
|
/retest |
|
/retest |
|
In response to a cherrypick label: new pull request created to branch |
What problem does this PR solve?
Issue Number: ref #10592, ref #9764
When the scheduling service runs independently,
AskBatchSplitneeds the split reason to keep MCS behavior consistent with the local PD path. Without it, affinity split checks and load-based split-scatter handling cannot work correctly in the MCS path.This PR intentionally carries the minimal split-reason plumbing that overlaps PR 10648 so the affinity fix can be validated independently. PR 10648 remains the owner of the broader split-scatter keyspace and range changes; rebasing either PR after the other merges should drop or preserve the overlapping plumbing without removing this affinity guard.
What is changed and how does it work?
Check List
Tests
Manual test commands:
Release note