feat(BA-6032): implement PROVISIONING sub-status pipeline for routes #11613

Merged
HyeockJinKim merged 3 commits into main from fix/BA-6032-provisioning-sub-status
May 14, 2026
Conversation

@HyeockJinKim
Collaborator

Summary

  • Implement three-stage PROVISIONING pipeline: PENDING → STARTING → WARMING_UP → RUNNING
    • ProvisioningHandler: targets PROVISIONING+PENDING, enqueues session → sub_status=STARTING
    • StartingHandler (new): targets PROVISIONING+STARTING, waits for kernel host/port via single JOIN query → sub_status=WARMING_UP
    • WarmingUpHandler (new): targets PROVISIONING+WARMING_UP, initial health probe → RUNNING+ACTIVE
    • RunningHandler: simplified to session liveness check only (RUNNING routes only)
  • Add ReplicaID typed identifier (NewType of UUID) for RoutingRow.id and propagate through repository/executor layer
  • Replace raw tuples with RouteSessionInfo / RouteSessionKernelInfo dataclasses for kernel connection info
  • Add RouteHandlerCategory.SYNC for ServiceDiscoverySyncHandler and AppProxySyncRouteHandler
  • Remove Phase 3/4 of check_running_routes(): replica info and health records are now initialized at the STARTING → WARMING_UP transition
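
The success path described in the summary can be sketched as a small transition table. This is an illustration only, not the code in this PR: the `SUCCESS_TRANSITIONS` table and the `walk` helper are hypothetical, the enum value `"provisioning"` is assumed (the other values follow the test plan), and failure targets and `traffic_status` are left out.

```python
from enum import Enum
from typing import List, Optional, Tuple


class RouteStatus(Enum):
    PROVISIONING = "provisioning"  # value assumed for illustration
    RUNNING = "running"


class RouteSubStatus(Enum):
    PENDING = "pending"
    STARTING = "starting"
    WARMING_UP = "warming_up"


State = Tuple[RouteStatus, Optional[RouteSubStatus]]

# Hypothetical table: (status, sub_status) -> (handler name, state on success).
SUCCESS_TRANSITIONS = {
    (RouteStatus.PROVISIONING, RouteSubStatus.PENDING): (
        "ProvisioningHandler",
        (RouteStatus.PROVISIONING, RouteSubStatus.STARTING),
    ),
    (RouteStatus.PROVISIONING, RouteSubStatus.STARTING): (
        "StartingHandler",
        (RouteStatus.PROVISIONING, RouteSubStatus.WARMING_UP),
    ),
    (RouteStatus.PROVISIONING, RouteSubStatus.WARMING_UP): (
        "WarmingUpHandler",
        (RouteStatus.RUNNING, None),
    ),
}


def walk(state: State) -> List[State]:
    """Follow success transitions until no handler targets the state."""
    path = [state]
    while state in SUCCESS_TRANSITIONS:
        _handler, state = SUCCESS_TRANSITIONS[state]
        path.append(state)
    return path


if __name__ == "__main__":
    for status, sub_status in walk((RouteStatus.PROVISIONING, RouteSubStatus.PENDING)):
        print(status.name, sub_status.name if sub_status else "-")
```

Each handler targets exactly one (status, sub_status) pair, so a route advances at most one stage per handler tick.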

Test plan

  • New route creation → sub_status='pending'
  • ProvisioningHandler tick → sub_status='starting'
  • StartingHandler tick (host/port ready) → sub_status='warming_up'
  • WarmingUpHandler tick (health pass) → status='running', traffic_status='active'
  • Session terminal in STARTING → route → TERMINATING
  • Session RUNNING but no inference port → route → TERMINATING
  • WarmingUpHandler: initial_delay exceeded without health pass → route → TERMINATING

Resolves BA-6032
Resolves BA-6033
Resolves BA-6034

HyeockJinKim and others added 2 commits May 14, 2026 18:34
- Add StartingHandler (PROVISIONING+STARTING → WARMING_UP)
- Add WarmingUpHandler (PROVISIONING+WARMING_UP → RUNNING)
- ProvisioningHandler now targets PENDING sub_status only
- RunningHandler simplified to session liveness check only
- RouteHandlerCategory.SYNC added for service discovery / AppProxy sync
- WarmingUpHandler.category changed to LIFECYCLE
- ReplicaID typed identifier applied to RoutingRow.id and related types
- RouteSessionInfo/RouteSessionKernelInfo replace raw tuples for kernel info
- fetch_route_session_kernel_infos: single JOIN query replaces two round trips
- check_running_routes Phase 3/4 removed (handled by StartingHandler)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… adapter

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@HyeockJinKim HyeockJinKim requested a review from a team as a code owner May 14, 2026 09:45
Copilot AI review requested due to automatic review settings May 14, 2026 09:45
@github-actions github-actions Bot added the size:XL 500~ LoC label May 14, 2026
@github-actions github-actions Bot added comp:manager Related to Manager component comp:common Related to Common component labels May 14, 2026
Contributor

Copilot AI left a comment

Pull request overview

Implements a staged route provisioning pipeline and introduces a typed ReplicaID identifier across route repository/executor/history layers.

Changes:

  • Adds PENDING → STARTING → WARMING_UP → RUNNING route lifecycle handling with new Starting/WarmingUp handlers and coordinator tasks.
  • Adds ReplicaID and propagates typed route IDs through repository, scheduling history, API adapter, and tests.
  • Refactors replica/kernel connection info into dataclasses and changes sync handler categorization to SYNC.
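
As a rough illustration of the typed-identifier and dataclass changes listed above, the sketch below defines `ReplicaID` as a `NewType` over `UUID`, as the PR summary states. The field names on `RouteSessionKernelInfo` and the `has_endpoint` property are hypothetical; the actual definitions live in the files listed in this review.

```python
import uuid
from dataclasses import dataclass
from typing import NewType, Optional

# NewType adds a distinct static type; at runtime the value is a plain UUID.
ReplicaID = NewType("ReplicaID", uuid.UUID)


@dataclass(frozen=True)
class RouteSessionKernelInfo:
    # All field names below are assumptions made for this sketch.
    route_id: ReplicaID
    session_status: str
    kernel_host: Optional[str]
    inference_port: Optional[int]

    @property
    def has_endpoint(self) -> bool:
        """True once both the host and the port are known."""
        return self.kernel_host is not None and self.inference_port is not None


if __name__ == "__main__":
    info = RouteSessionKernelInfo(
        route_id=ReplicaID(uuid.uuid4()),
        session_status="running",
        kernel_host="10.0.0.5",
        inference_port=8000,
    )
    print(info.route_id, info.has_endpoint)
```

Compared with a raw tuple, named fields make call sites self-describing, and a type checker can reject a bare `UUID` where a `ReplicaID` is expected.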

Reviewed changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| src/ai/backend/common/identifier/replica.py | Adds typed ReplicaID. |
| src/ai/backend/manager/data/deployment/types.py | Adds SYNC route handler category. |
| src/ai/backend/manager/models/routing/row.py | Types routing row IDs as ReplicaID and removes custom initializer. |
| src/ai/backend/manager/models/scheduling_history/row.py | Types route history route IDs as ReplicaID. |
| src/ai/backend/manager/repositories/deployment/creators/route.py | Creates routes with provisioning PENDING sub-status. |
| src/ai/backend/manager/repositories/deployment/db_source/db_source.py | Adds route/session/kernel info fetch and typed replica updates. |
| src/ai/backend/manager/repositories/deployment/repository.py | Exposes typed route/session/kernel repository APIs. |
| src/ai/backend/manager/repositories/deployment/types/endpoint.py | Updates RouteData IDs and adds session/kernel info dataclasses. |
| src/ai/backend/manager/repositories/deployment/types/__init__.py | Re-exports new route session info dataclasses. |
| src/ai/backend/manager/repositories/scheduling_history/creators.py | Types route history creator specs with ReplicaID. |
| src/ai/backend/manager/repositories/scheduling_history/types.py | Types route-scoped history search with ReplicaID. |
| src/ai/backend/manager/api/adapters/scheduling_history/adapter.py | Converts route history API IDs to ReplicaID. |
| src/ai/backend/manager/sokovan/deployment/route/types.py | Adds lifecycle phases for STARTING/WARMING_UP checks. |
| src/ai/backend/manager/sokovan/deployment/route/coordinator.py | Registers new handlers and periodic task specs. |
| src/ai/backend/manager/sokovan/deployment/route/executor.py | Implements STARTING/WARMING_UP checks and updates health record initialization. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/__init__.py | Exports running/starting/warming handlers. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/provisioning.py | Narrows provisioning to PENDING routes and moves success to STARTING. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/starting.py | Adds STARTING readiness handler. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/warming_up.py | Adds WARMING_UP activation handler. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/running.py | Simplifies running checks to RUNNING lifecycle routes. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/appproxy_sync.py | Categorizes AppProxy sync as SYNC. |
| src/ai/backend/manager/sokovan/deployment/route/handlers/service_discovery_sync.py | Categorizes service discovery sync as SYNC. |
| Test files | Update route IDs in fixtures/assertions to use ReplicaID and new kernel info dataclass where needed. |

Comment on lines +418 to +422
    if record.last_check > 0 and not record.is_stale(current_time) and record.healthy:
        successes.append(route)
        continue

    if current_time > record.initial_delay_until:
Comment on lines +66 to +69
success=RouteTransitionTarget(
status=RouteStatus.RUNNING,
traffic_status=RouteTrafficStatus.ACTIVE,
),
"""
return RouteStatusTransitions(
success=RouteTransitionTarget(
status=RouteStatus.RUNNING,
sub_status=RouteSubStatus.STARTING,
),
failure=RouteTransitionTarget(
status=RouteStatus.FAILED_TO_START,
sub_status=RouteSubStatus.WARMING_UP,
),
failure=RouteTransitionTarget(
status=RouteStatus.TERMINATING,
route_id_str = str(route.route_id)
record = records.get(route_id_str)

if record is None:
Comment on lines +82 to +84
"Warming-up check: {} routes activated (→ running), {} still probing",
len(result.successes),
len(result.errors) + len(result.stale),
Comment on lines +28 to +30
Success (→ WARMING_UP): session is running and host/port is populated.
Failure (→ FAILED_TO_START): session terminated or not found.
Routes running but without host/port yet are silently retried next tick.
Comment on lines +227 to +241
    async def check_starting_routes(self, routes: Sequence[RouteData]) -> RouteExecutionResult:
        """Check if STARTING routes have their sessions ready.

        Queries session status and kernel connection info for routes whose sessions
        are being provisioned. Transitions routes to:
        - success (replica info ready): session is RUNNING with an inference port
        - error (session terminated): session reached a terminal status
        - skip (still starting): session is not yet RUNNING

        Args:
            routes: Routes in STARTING state to check

        Returns:
            Result containing routes that are ready (success) or failed (error)
        """
Comment on lines +387 to +393
    async def check_warming_up_health(self, routes: Sequence[RouteData]) -> RouteExecutionResult:
        """Check health of PROVISIONING+WARMING_UP routes for initial activation.

        - success: health probe passed, or no health check configured → RUNNING+ACTIVE
        - failure: initial_delay exceeded without a passing probe → TERMINATING
        - (no transition): still within initial_delay → route stays WARMING_UP
        """
@HyeockJinKim HyeockJinKim merged commit 584d934 into main May 14, 2026
36 checks passed
@HyeockJinKim HyeockJinKim deleted the fix/BA-6032-provisioning-sub-status branch May 14, 2026 11:27