feat(BA-6032): implement PROVISIONING sub-status pipeline for routes #11613
Merged
- Add StartingHandler (PROVISIONING+STARTING → WARMING_UP)
- Add WarmingUpHandler (PROVISIONING+WARMING_UP → RUNNING)
- ProvisioningHandler now targets PENDING sub_status only
- RunningHandler simplified to session liveness check only
- RouteHandlerCategory.SYNC added for service discovery / AppProxy sync
- WarmingUpHandler.category changed to LIFECYCLE
- ReplicaID typed identifier applied to RoutingRow.id and related types
- RouteSessionInfo/RouteSessionKernelInfo replace raw tuples for kernel info
- fetch_route_session_kernel_infos: single JOIN query replaces two round trips
- check_running_routes Phase 3/4 removed (handled by StartingHandler)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… adapter Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pull request overview
Implements a staged route provisioning pipeline and introduces a typed ReplicaID identifier across route repository/executor/history layers.
Changes:
- Adds PENDING → STARTING → WARMING_UP → RUNNING route lifecycle handling with new Starting/WarmingUp handlers and coordinator tasks.
- Adds ReplicaID and propagates typed route IDs through repository, scheduling history, API adapter, and tests.
- Refactors replica/kernel connection info into dataclasses and changes sync handler categorization to SYNC.
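The staged lifecycle listed above can be pictured as a small success-path table. The following is only an illustrative sketch: `RouteSubStatus` here is a stand-in enum, and `NEXT_ON_SUCCESS` and `stages_until_running` are hypothetical names that do not come from the PR.

```python
from enum import Enum
from typing import Optional


class RouteSubStatus(Enum):
    """Stand-in for the PROVISIONING sub-statuses named in the PR (illustrative)."""

    PENDING = "pending"
    STARTING = "starting"
    WARMING_UP = "warming_up"


# Success edges of the staged pipeline. None means the route leaves
# PROVISIONING and becomes RUNNING. (Hypothetical table, not the PR's code.)
NEXT_ON_SUCCESS: dict[RouteSubStatus, Optional[RouteSubStatus]] = {
    RouteSubStatus.PENDING: RouteSubStatus.STARTING,
    RouteSubStatus.STARTING: RouteSubStatus.WARMING_UP,
    RouteSubStatus.WARMING_UP: None,
}


def stages_until_running(start: RouteSubStatus) -> list[str]:
    """Return the stage names a route passes through from `start` to RUNNING."""
    path = [start.name]
    current: Optional[RouteSubStatus] = start
    while current is not None:
        current = NEXT_ON_SUCCESS[current]
        path.append(current.name if current is not None else "RUNNING")
    return path
```

Each handler in the PR owns exactly one of these edges, which is why a table keyed by the current sub-status is enough to describe the success path; failure targets are omitted here.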
Reviewed changes
Copilot reviewed 36 out of 36 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| src/ai/backend/common/identifier/replica.py | Adds typed ReplicaID. |
| src/ai/backend/manager/data/deployment/types.py | Adds SYNC route handler category. |
| src/ai/backend/manager/models/routing/row.py | Types routing row IDs as ReplicaID and removes custom initializer. |
| src/ai/backend/manager/models/scheduling_history/row.py | Types route history route IDs as ReplicaID. |
| src/ai/backend/manager/repositories/deployment/creators/route.py | Creates routes with provisioning PENDING sub-status. |
| src/ai/backend/manager/repositories/deployment/db_source/db_source.py | Adds route/session/kernel info fetch and typed replica updates. |
| src/ai/backend/manager/repositories/deployment/repository.py | Exposes typed route/session/kernel repository APIs. |
| src/ai/backend/manager/repositories/deployment/types/endpoint.py | Updates RouteData IDs and adds session/kernel info dataclasses. |
| src/ai/backend/manager/repositories/deployment/types/__init__.py | Re-exports new route session info dataclasses. |
| src/ai/backend/manager/repositories/scheduling_history/creators.py | Types route history creator specs with ReplicaID. |
| src/ai/backend/manager/repositories/scheduling_history/types.py | Types route-scoped history search with ReplicaID. |
| src/ai/backend/manager/api/adapters/scheduling_history/adapter.py | Converts route history API IDs to ReplicaID. |
| src/ai/backend/manager/sokovan/deployment/route/types.py | Adds lifecycle phases for STARTING/WARMING_UP checks. |
| src/ai/backend/manager/sokovan/deployment/route/coordinator.py | Registers new handlers and periodic task specs. |
| src/ai/backend/manager/sokovan/deployment/route/executor.py | Implements STARTING/WARMING_UP checks and updates health record initialization. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/__init__.py | Exports running/starting/warming handlers. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/provisioning.py | Narrows provisioning to PENDING routes and moves success to STARTING. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/starting.py | Adds STARTING readiness handler. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/warming_up.py | Adds WARMING_UP activation handler. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/running.py | Simplifies running checks to RUNNING lifecycle routes. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/appproxy_sync.py | Categorizes AppProxy sync as SYNC. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/service_discovery_sync.py | Categorizes service discovery sync as SYNC. |
| Test files | Update route IDs in fixtures/assertions to use ReplicaID and the new kernel info dataclass where needed. |
Comment on lines +418 to +422

    if record.last_check > 0 and not record.is_stale(current_time) and record.healthy:
        successes.append(route)
        continue

    if current_time > record.initial_delay_until:
Comment on lines +66 to +69

    success=RouteTransitionTarget(
        status=RouteStatus.RUNNING,
        traffic_status=RouteTrafficStatus.ACTIVE,
    ),


    """
    return RouteStatusTransitions(
        success=RouteTransitionTarget(
            status=RouteStatus.RUNNING,


        sub_status=RouteSubStatus.STARTING,
    ),
    failure=RouteTransitionTarget(
        status=RouteStatus.FAILED_TO_START,


        sub_status=RouteSubStatus.WARMING_UP,
    ),
    failure=RouteTransitionTarget(
        status=RouteStatus.TERMINATING,


    route_id_str = str(route.route_id)
    record = records.get(route_id_str)

    if record is None:
Comment on lines +82 to +84

    "Warming-up check: {} routes activated (→ running), {} still probing",
    len(result.successes),
    len(result.errors) + len(result.stale),
Comment on lines +28 to +30

    Success (→ WARMING_UP): session is running and host/port is populated.
    Failure (→ FAILED_TO_START): session terminated or not found.
    Routes running but without host/port yet are silently retried next tick.
Comment on lines +227 to +241

    async def check_starting_routes(self, routes: Sequence[RouteData]) -> RouteExecutionResult:
        """Check if STARTING routes have their sessions ready.

        Queries session status and kernel connection info for routes whose sessions
        are being provisioned. Transitions routes to:
        - success (replica info ready): session is RUNNING with an inference port
        - error (session terminated): session reached a terminal status
        - skip (still starting): session is not yet RUNNING

        Args:
            routes: Routes in STARTING state to check

        Returns:
            Result containing routes that are ready (success) or failed (error)
        """
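The three outcomes in the docstring above (success, error, skip) can be illustrated with a small classifier. This is a minimal sketch under stated assumptions, not the PR's implementation: `SessionSnapshot`, its field names, the status strings, and the `TERMINAL_STATUSES` set are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    SUCCESS = "success"  # route would move on to WARMING_UP
    ERROR = "error"      # route would move to FAILED_TO_START
    SKIP = "skip"        # no transition; checked again on the next tick


@dataclass(frozen=True)
class SessionSnapshot:
    """Hypothetical view of a session; field names are illustrative only."""

    status: str
    kernel_host: Optional[str]
    kernel_port: Optional[int]


# Assumed set of terminal session statuses, for illustration only.
TERMINAL_STATUSES = frozenset({"TERMINATED", "CANCELLED", "ERROR"})


def classify_starting_route(session: Optional[SessionSnapshot]) -> Outcome:
    """Classify one STARTING route from its session snapshot."""
    # A missing or terminated session can never become ready.
    if session is None or session.status in TERMINAL_STATUSES:
        return Outcome.ERROR
    # Still being provisioned: wait for the next tick.
    if session.status != "RUNNING":
        return Outcome.SKIP
    # Running, but connection info is not populated yet: retry silently.
    if session.kernel_host is None or session.kernel_port is None:
        return Outcome.SKIP
    return Outcome.SUCCESS
```

Treating "running without host/port" as a skip rather than an error matches the handler docstring quoted earlier in the review, which says such routes are silently retried on the next tick.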
Comment on lines +387 to +393

    async def check_warming_up_health(self, routes: Sequence[RouteData]) -> RouteExecutionResult:
        """Check health of PROVISIONING+WARMING_UP routes for initial activation.

        - success: health probe passed, or no health check configured → RUNNING+ACTIVE
        - failure: initial_delay exceeded without a passing probe → TERMINATING
        - (no transition): still within initial_delay → route stays WARMING_UP
        """
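The warming-up decision described in this docstring, together with the conditions visible in the reviewed snippet for lines +418 to +422, can be sketched as a pure function. The `HealthRecord` class below is a hypothetical stand-in: the attribute names `last_check`, `healthy`, `initial_delay_until`, and `is_stale` appear in the diff, but `stale_after`, the `health_check_configured` flag, and the handling of a missing record are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class WarmUpOutcome(Enum):
    ACTIVATE = "activate"    # would become RUNNING+ACTIVE
    TERMINATE = "terminate"  # would become TERMINATING
    WAIT = "wait"            # stays WARMING_UP


@dataclass
class HealthRecord:
    """Hypothetical health record; stale_after is an assumed parameter."""

    last_check: float
    healthy: bool
    initial_delay_until: float
    stale_after: float = 30.0

    def is_stale(self, now: float) -> bool:
        return now - self.last_check > self.stale_after


def classify_warming_up(
    record: Optional[HealthRecord],
    now: float,
    health_check_configured: bool = True,
) -> WarmUpOutcome:
    """Decide what happens to one WARMING_UP route at time `now`."""
    if not health_check_configured:
        return WarmUpOutcome.ACTIVATE
    if record is None:
        # Assumption: with no record yet, keep waiting rather than fail.
        return WarmUpOutcome.WAIT
    # A fresh, passing probe activates the route.
    if record.last_check > 0 and not record.is_stale(now) and record.healthy:
        return WarmUpOutcome.ACTIVATE
    # No passing probe and the initial delay has elapsed.
    if now > record.initial_delay_until:
        return WarmUpOutcome.TERMINATE
    return WarmUpOutcome.WAIT
```

Keeping the decision free of I/O makes the three branches easy to unit test with fixed timestamps.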
Summary
- Implement the PENDING → STARTING → WARMING_UP → RUNNING provisioning pipeline:
  - ProvisioningHandler: PROVISIONING+PENDING, enqueues session → sub_status=STARTING
  - StartingHandler: PROVISIONING+STARTING, waits for kernel host/port via single JOIN query → sub_status=WARMING_UP
  - WarmingUpHandler: PROVISIONING+WARMING_UP, initial health probe → RUNNING+ACTIVE
- Add ReplicaID typed identifier (NewType of UUID) for RoutingRow.id and propagate it through the repository/executor layer
- Add RouteSessionInfo/RouteSessionKernelInfo dataclasses for kernel connection info
- Add RouteHandlerCategory.SYNC for ServiceDiscoverySyncHandler and AppProxySyncRouteHandler
- check_running_routes() Phase 3/4 removed; replica info and health records are now initialized at the STARTING → WARMING_UP transition

Test plan
- sub_status='pending'
- sub_status='starting'
- sub_status='warming_up'
- status='running', traffic_status='active'
- TERMINATING
- TERMINATING
- TERMINATING

Resolves BA-6032
Resolves BA-6033
Resolves BA-6034
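The summary describes ReplicaID as a NewType of UUID. A minimal sketch of what such a typed identifier looks like follows; only the name ReplicaID and its base type come from the PR, while `parse_replica_id` and `health_record_key` are hypothetical helpers added for illustration.

```python
from typing import NewType
from uuid import UUID

# Typed identifier: a distinct type for static checkers, a plain UUID at runtime.
ReplicaID = NewType("ReplicaID", UUID)


def parse_replica_id(raw: str) -> ReplicaID:
    """Convert an API-level string into a typed ID (hypothetical helper)."""
    return ReplicaID(UUID(raw))


def health_record_key(replica_id: ReplicaID) -> str:
    """Build a string key from a typed ID (hypothetical helper)."""
    return str(replica_id)
```

Because `NewType` adds no runtime wrapper, existing code that expects a `UUID` keeps working, while a type checker can flag places where an arbitrary `UUID` is passed where a route ID is required.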