*: backport QueryRegion gRPC stream to release-8.5#10628
Conversation
|
This cherry pick PR is for a release branch and has not yet been approved by triage owners. To merge this cherry pick:
DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| Keys: [][]byte{key}, | ||
| }, options.allowFollowerHandle && c.option.getEnableFollowerHandle()) | ||
| if !fallback { | ||
| if err = c.respForErr(metrics.CmdFailDurationGetRegion, start, err, resp.GetHeader()); err != nil { |
There was a problem hiding this comment.
Could we guard resp before calling resp.GetHeader() here? queryRegion can return resp == nil with fallback == false on non-fallback stream errors, which would panic instead of returning the original error.
|
/hold |
Pin kvproto to v0.0.0-20260518092652-f96c651c7702 across all four modules. This is the kvproto release-8.5 branch tip with the QueryRegion stream proto cherry-picked (pingcap/kvproto#1467, commit f96c651), so release-8.5-only protos are retained while QueryRegion becomes available. Signed-off-by: JmPotato <[email protected]>
Backport the server-side QueryRegion gRPC stream so release-8.5 PD can serve batched region queries (key / prev-key / ID lookups). Squashes the server-side portions of the following upstream PRs (ref tikv#8690); client-side changes are intentionally excluded: - tikv#8979 server, core: implement the query region gRPC server. Adds RegionsInfo.QueryRegions, regionTree.searchByKeys/searchByPrevKeys, the GrpcServer.QueryRegion stream handler and server metric. The rate limit and cluster-id-mismatch helpers (rateLimitCheck / errs.ErrMismatchClusterID, added later upstream by tikv#8995) are adapted to the release-8.5 inline idioms (currentFunction limiter + status.Errorf). - tikv#9055 fix: size regionsByID with len(regions)+len(prevRegions)+len(ids). - tikv#9076 add QueryRegion related metrics (pkg/core query_region_* vecs, server query_region_duration_seconds; per-step metrics & guards). - tikv#9196 allow the follower to handle QueryRegion (leader/follower branch using s.cluster + region syncer, mirroring the GetRegion follower path). - tikv#10194 optimize region query with batched lookups (slice.SplitIntoBatches, getRegionsByIDs, batchSearchSize 16 -> 128, drop panic assertions). Signed-off-by: JmPotato <[email protected]>
0a7a02f to
1af8ebf
Compare
Backport master client/pkg/batch Controller from 08d563f and migrate the 8.5 TSO dispatcher onto it as a behavior-preserving refactor. Deltas from master: batch tests keep the 8.5 client/testutil import; the TSO finisher captures the 8.5-specific suffixBits field; connection management remains on the existing 8.5 path for the follow-up connectionctx PR. Signed-off-by: JmPotato <[email protected]>
Backport master client/pkg/connectionctx Manager from 08d563f and migrate the 8.5 TSO stream connection map onto it as a behavior-preserving refactor. Deltas from master: connectionctx tests keep the 8.5 client/testutil import; 8.5 keeps per-dispatcher managers to preserve dc-local TSO behavior; QueryRegion client wiring is left to the follow-up feature PR. Signed-off-by: JmPotato <[email protected]>
Signed-off-by: JmPotato <[email protected]>
Signed-off-by: JmPotato <[email protected]>
Signed-off-by: JmPotato <[email protected]>
| if !ok { | ||
| return errors.New("[pd] invalid value type for EnableQueryRegion option, it should be bool") | ||
| } | ||
| c.option.setEnableQueryRegion(enable) |
There was a problem hiding this comment.
When EnableQueryRegion is turned off dynamically, the existing regionClient keeps its daemon and streams alive until Close. Should we close/release it on disable to actually stop the stream path?
| } | ||
| _, ok := c.connectionCtxs[url] | ||
| if !overwriteFlag && ok { | ||
| return |
There was a problem hiding this comment.
If Store returns early because this URL already exists, the newly-created stream/context from the caller is left uncanceled. Could Store return whether it stored the context, or should callers cancel on the false path?
Signed-off-by: JmPotato <[email protected]>
|
@JmPotato: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
This single PR has too many changes, temporarily closed. Will resubmit the PR after splitting it later. |
What problem does this PR solve?
Issue Number: ref #8690
Backports the
QueryRegiongRPC stream torelease-8.5, including both:This lets release-8.5 PD serve and consume batched region queries (key / prev-key / ID lookups) through the same QueryRegion stream semantics used on master, while keeping the feature disabled by default on the client side.
Dependency
Requires the kvproto QueryRegion proto on the
release-8.5line:pingcap/kvproto#1467 (commit
f96c651) must merge first. This PR pinskvproto to
v0.0.0-20260518092652-f96c651c7702(kvproto release-8.5 tipwith the QueryRegion proto cherry-picked and
pdpb.pb.goregenerated usingthe CI-pinned protoc 3.8.0 + gogo v1.3.2 toolchain, so release-8.5-only
protos such as
StoreStats.NetworkSlowScoresare retained).What is changed and how does it work?
This PR contains eight commits:
chore: bump kvproto for QueryRegion backport— pin kvproto across all four modules.feat: backport QueryRegion gRPC server to release-8.5— backport the server-side QueryRegion path:Adds
RegionsInfo.QueryRegions,regionTree.searchByKeys/searchByPrevKeys,the
GrpcServer.QueryRegionstream handler and server metric. TherateLimitCheck/errs.ErrMismatchClusterIDhelpers (added laterupstream by server: use rateLimitCheck to reduce code #8995) are adapted to release-8.5 inline idioms
(
currentFunctionlimiter +status.Errorf).regionsByIDwithlen(regions)+len(prevRegions)+len(ids).pkg/corequery_region_*vecs, server
query_region_duration_seconds, per-step metrics and guards).branch using
s.cluster+ region syncer, mirroring theGetRegionfollower path).
(
slice.SplitIntoBatches,getRegionsByIDs,batchSearchSize16→128,drop panic assertions).
refactor(client): genericize TSO batch controller— backport master-alignedclient/pkg/batch.Controller[T]and migrate the 8.5 TSO dispatcher onto it.The generic controller is kept aligned with master; 8.5-specific deltas are
limited to the testutil import path and preserving 8.5 TSO
suffixBitsin theTSO request finisher.
refactor(client): genericize TSO connection context manager— backportmaster-aligned
client/pkg/connectionctx.Manager[T]and migrate the 8.5 TSOstream connection map onto it. The generic manager is kept aligned with master;
8.5 keeps the existing per-dispatcher manager shape to preserve dc-local TSO behavior.
feat(client): backport QueryRegion stream client— backport the client-sideQueryRegion stream path on top of the generic batch and connectionctx helpers:
GetRegion,GetPrevRegion, andGetRegionByIDuse the stream path whenEnableQueryRegionis enabled, and fall back to unary RPC on stream errors.EnableQueryRegionis dynamic and defaults tofalse.release-8.5 target only needs the PD leader/follower path.
fix(client): update QueryRegion follower streams on option change— keepthe release-8.5 stream client aligned with master follower-handle behavior by
notifying the QueryRegion connection daemon when
EnableFollowerHandlechanges.test(client): backport QueryRegion client coverage— backport/adapt themaster QueryRegion client unit and integration coverage to release-8.5. The
router/microservice integration tests are intentionally not included because
release-8.5 does not backport
routerpbor the router microservice path.fix(client): stop QueryRegion client on disable— mirror the master routerclient initializer lifecycle (client: support dynamic start/stop of the router client #9082) for the release-8.5 QueryRegion client:
enabling creates the stream client through the initializer, disabling closes
it after releasing the client lock, and repeated same-value updates do not
emit duplicate notifications. Release-8.5-specific deltas are limited to
keeping the local option setter style (get-then-store instead of master CAS)
and explicitly draining the QueryRegion client during
Close()after theparent context is canceled; this is deterministic shutdown for the 8.5
in-package client shape and does not change request behavior.
This PR does not flip
defaultEnableQueryRegion.Check List
Tests
pkg/coreTestQueryRegions,pkg/sliceTestSliceSplitIntoBatches, plus regenerated benchmarkBenchmarkQueryRegions)client/pkg/batch,client/pkg/connectionctx, QueryRegion client request/finisher/fallback tests, including the master no-data-race finisher case)tests/integrations/clientQueryRegion client enabled/disabled suites andTestGetRegionByQueryRegionStream)Release note