Restore zero initial scale behavior on master branch#709

Merged
openshift-merge-bot[bot] merged 4 commits into opendatahub-io:master from brettmthompson:feature/restore-zero-initial-scale-behavior-master
Jul 3, 2025
Conversation

@brettmthompson commented Jun 30, 2025

What this PR does / why we need it:
RHOAIENG-27519

Restores the ODH-specific zero initial scale behavior enabled in #537, following the sync of the upstream initial scale behavior.

Type of changes
Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Feature/Issue validation/testing:

Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

Checklist:

  • Have you added unit/e2e tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?
  • Have you linked the JIRA issue(s) to this PR?

Release note:

None

Re-running failed tests

  • /rerun-all - rerun all failed workflows.
  • /rerun-workflow <workflow name> - rerun a specific failed workflow. Only one workflow name can be specified. Multiple /rerun-workflow commands are allowed per comment.

Summary by CodeRabbit

  • New Features

    • Improved support for configuring zero initial scale in serverless inference services and inference graphs, allowing explicit control over initial scale behavior when minimum replicas are set to zero.
  • Bug Fixes

    • Updated validation to ensure initial scale annotations are set correctly based on configuration and minimum replicas.
  • Chores

    • Removed outdated permissions and dependencies related to Knative Operator.
    • Cleaned up configuration and documentation to reflect current annotation handling.
  • Tests

    • Added new test cases to verify correct behavior for zero initial scale scenarios.

Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
@coderabbitai (bot) commented Jun 30, 2025

Walkthrough

This update removes all dependencies, RBAC permissions, and runtime checks related to the Knative Operator and its KnativeServing CRD from the codebase. It also updates logic for the autoscaling.knative.dev/initial-scale annotation, allowing it when appropriate and adjusting related configuration, validation, and test cases. Constructors for key components now accept an allowZeroInitialScale parameter.

Changes

| Files/Paths | Change Summary |
|---|---|
| charts/kserve-resources/templates/clusterrole.yaml, config/rbac/role.yaml | Removed RBAC permissions for knativeservings in operator.knative.dev API group. |
| go.mod, cmd/manager/main.go, pkg/testing/envtest_setup.go | Removed Knative Operator dependency and related scheme registration/imports. |
| config/configmap/inferenceservice.yaml, config/overlays/odh/inferenceservice-config-patch.yaml, config/overlays/test/configmap/inferenceservice-openshift-ci-serverless-predictor.yaml | Removed "autoscaling.knative.dev/initial-scale" from disallowed service annotations in config. |
| pkg/constants/constants.go | Removed KnativeServingKind constant and initial scale annotation from disallowed list. |
| pkg/controller/v1alpha1/inferencegraph/controller.go | Removed KnativeServing CRD check; updated validation logic to use MinReplicas. |
| pkg/controller/v1alpha1/inferencegraph/controller_test.go, pkg/controller/v1beta1/inferenceservice/controller_test.go | Added test cases for zero initial scale behavior with/without annotation; minor test renaming. |
| pkg/controller/v1alpha1/utils/utils.go | Updated ValidateInitialScaleAnnotation to accept minReplicas and set annotation based on logic. |
| pkg/controller/v1beta1/inferenceservice/components/predictor.go, pkg/controller/v1beta1/inferenceservice/components/transformer.go, pkg/controller/v1beta1/inferenceservice/components/explainer.go | Added allowZeroInitialScale field; updated constructors and logic to use it for annotation validation. |
| pkg/controller/v1beta1/inferenceservice/controller.go | Removed KnativeServing CRD logic; updated component constructors to use new parameter. |
| pkg/openapi/openapi_generated.go | Removed extraneous blank lines from OpenAPI schema functions (formatting only). |
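
To illustrate what removing "autoscaling.knative.dev/initial-scale" from the disallowed service annotations means in practice, here is a minimal sketch. The helper `filterAnnotations` and the `example.com/owner` annotation are hypothetical and exist only for this illustration; the filtering code in the repository may be structured differently.

```go
package main

import "fmt"

// filterAnnotations is a hypothetical helper, not the repository's code.
// It returns a copy of annotations without any key listed in disallowed.
func filterAnnotations(annotations map[string]string, disallowed []string) map[string]string {
	blocked := make(map[string]struct{}, len(disallowed))
	for _, key := range disallowed {
		blocked[key] = struct{}{}
	}
	kept := make(map[string]string, len(annotations))
	for key, value := range annotations {
		if _, isBlocked := blocked[key]; !isBlocked {
			kept[key] = value
		}
	}
	return kept
}

func main() {
	const initialScaleKey = "autoscaling.knative.dev/initial-scale"
	annotations := map[string]string{
		initialScaleKey:     "0",
		"example.com/owner": "team-a", // made-up annotation for the demo
	}

	// With the key in the disallowed list, it is dropped before propagation.
	before := filterAnnotations(annotations, []string{initialScaleKey})
	// With the key removed from the list, it is kept.
	after := filterAnnotations(annotations, nil)

	fmt.Println(len(before), len(after))
}
```

Under this assumption, once the key is off the list the annotation reaches the later validation step instead of being dropped up front.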

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Controller
    participant KnativeAPI

    User->>Controller: Create InferenceService/InferenceGraph (MinReplicas=0)
    Controller->>Controller: Check allowZeroInitialScale flag
    alt allowZeroInitialScale is true
        Controller->>Controller: Set initialScale annotation to "0"
    else allowZeroInitialScale is false
        Controller->>Controller: Do not set initialScale annotation
    end
    Controller->>KnativeAPI: Deploy Knative Service with annotations
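
The decision in the sequence diagram can be sketched in Go as follows. This is not the body of ValidateInitialScaleAnnotation: the function name `applyInitialScale`, the `*int32` type for the minimum replicas, and the handling of an annotation the user has already set are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

const initialScaleKey = "autoscaling.knative.dev/initial-scale"

// applyInitialScale is a hypothetical stand-in for the validation step.
// minReplicas is a pointer because the field may be left unset.
func applyInitialScale(annotations map[string]string, allowZero bool, minReplicas *int32) {
	if value, ok := annotations[initialScaleKey]; ok {
		// Assumption: a user-supplied value is dropped when it is not an
		// integer, or when it is zero and zero is not allowed.
		n, err := strconv.Atoi(value)
		if err != nil || (n == 0 && !allowZero) {
			delete(annotations, initialScaleKey)
		}
		return
	}
	// Diagram branch: MinReplicas=0 and zero initial scale allowed.
	if allowZero && minReplicas != nil && *minReplicas == 0 {
		annotations[initialScaleKey] = "0"
	}
}

func main() {
	zero := int32(0)

	allowed := map[string]string{}
	applyInitialScale(allowed, true, &zero)

	notAllowed := map[string]string{}
	applyInitialScale(notAllowed, false, &zero)

	fmt.Println(allowed, notAllowed)
}
```

In this sketch the annotation is only added when both conditions hold; otherwise the map is left without it, matching the "Do not set initialScale annotation" branch.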

Poem

In the warren, code is neat,
Old dependencies face defeat.
Knative Operator hops away,
While zero scale can now stay.
RBAC rules trimmed with care,
Tests ensure all works fair.
🐇 Hopping forward, light and fleet!


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 053cf51 and 087e4a2.

📒 Files selected for processing (2)
  • pkg/controller/v1beta1/inferenceservice/components/explainer.go (3 hunks)
  • pkg/controller/v1beta1/inferenceservice/controller_test.go (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/controller/v1beta1/inferenceservice/components/explainer.go
  • pkg/controller/v1beta1/inferenceservice/controller_test.go
⏰ Context from checks skipped due to timeout of 90000ms (12)
  • GitHub Check: precommit-check
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: build (3.9)
  • GitHub Check: build (3.10)
  • GitHub Check: build (3.12)
  • GitHub Check: build (3.11)
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: Build

@coderabbitai (bot) left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eab8010 and 053cf51.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (18)
  • charts/kserve-resources/templates/clusterrole.yaml (0 hunks)
  • cmd/manager/main.go (0 hunks)
  • config/configmap/inferenceservice.yaml (0 hunks)
  • config/overlays/odh/inferenceservice-config-patch.yaml (0 hunks)
  • config/overlays/test/configmap/inferenceservice-openshift-ci-serverless-predictor.yaml (0 hunks)
  • config/rbac/role.yaml (0 hunks)
  • go.mod (0 hunks)
  • pkg/constants/constants.go (0 hunks)
  • pkg/controller/v1alpha1/inferencegraph/controller.go (2 hunks)
  • pkg/controller/v1alpha1/inferencegraph/controller_test.go (5 hunks)
  • pkg/controller/v1alpha1/utils/utils.go (1 hunks)
  • pkg/controller/v1beta1/inferenceservice/components/explainer.go (3 hunks)
  • pkg/controller/v1beta1/inferenceservice/components/predictor.go (3 hunks)
  • pkg/controller/v1beta1/inferenceservice/components/transformer.go (3 hunks)
  • pkg/controller/v1beta1/inferenceservice/controller.go (3 hunks)
  • pkg/controller/v1beta1/inferenceservice/controller_test.go (5 hunks)
  • pkg/openapi/openapi_generated.go (0 hunks)
  • pkg/testing/envtest_setup.go (0 hunks)
💤 Files with no reviewable changes (10)
  • config/configmap/inferenceservice.yaml
  • config/rbac/role.yaml
  • go.mod
  • pkg/testing/envtest_setup.go
  • charts/kserve-resources/templates/clusterrole.yaml
  • config/overlays/test/configmap/inferenceservice-openshift-ci-serverless-predictor.yaml
  • config/overlays/odh/inferenceservice-config-patch.yaml
  • cmd/manager/main.go
  • pkg/constants/constants.go
  • pkg/openapi/openapi_generated.go
🧰 Additional context used
🧬 Code Graph Analysis (6)
pkg/controller/v1alpha1/utils/utils.go (1)
pkg/constants/constants.go (1)
  • DefaultMinReplicas (174-174)
pkg/controller/v1alpha1/inferencegraph/controller.go (1)
pkg/controller/v1alpha1/utils/utils.go (1)
  • ValidateInitialScaleAnnotation (114-167)
pkg/controller/v1beta1/inferenceservice/components/transformer.go (1)
pkg/controller/v1alpha1/utils/utils.go (1)
  • ValidateInitialScaleAnnotation (114-167)
pkg/controller/v1alpha1/inferencegraph/controller_test.go (2)
pkg/constants/constants.go (3)
  • InferenceServiceConfigMapName (51-51)
  • KServeNamespace (41-41)
  • Serverless (441-441)
pkg/apis/serving/v1alpha1/inference_graph.go (7)
  • InferenceGraph (35-40)
  • InferenceGraphSpec (44-89)
  • InferenceRouter (256-272)
  • GraphRootNodeName (117-117)
  • Sequence (103-103)
  • InferenceStep (305-331)
  • InferenceTarget (276-287)
pkg/controller/v1beta1/inferenceservice/components/explainer.go (1)
pkg/controller/v1alpha1/utils/utils.go (1)
  • ValidateInitialScaleAnnotation (114-167)
pkg/controller/v1beta1/inferenceservice/components/predictor.go (1)
pkg/controller/v1alpha1/utils/utils.go (1)
  • ValidateInitialScaleAnnotation (114-167)
🔇 Additional comments (24)
pkg/controller/v1alpha1/inferencegraph/controller.go (2)

286-286: LGTM! Error message properly updated.

The error message correctly reflects the removal of the KnativeServing CRD dependency while maintaining clarity about the requirement for Knative Serving.


295-295: LGTM! Function call updated with required parameter.

The call to ValidateInitialScaleAnnotation correctly includes the graph.Spec.MinReplicas parameter, ensuring the new zero initial scale logic works properly for InferenceGraphs.

pkg/controller/v1beta1/inferenceservice/controller.go (4)

203-204: LGTM! Improved variable declaration and comment.

The variable declaration is clearer and the comment better describes the early abort condition for Serverless mode.


214-214: LGTM! Error message properly updated.

The error message correctly reflects the simplified check for Knative Services availability.


218-218: LGTM! Proper variable assignment without redeclaration.

The variable assignment correctly avoids redeclaring allowZeroInitialScale while assigning the result from CheckZeroInitialScaleAllowed.
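
For readers less familiar with the Go scoping rule behind this comment, here is a small sketch of the difference between `:=` and `=` inside a nested block. The stub `checkZeroInitialScaleAllowed` and its `(bool, error)` return type are assumptions for illustration and do not reflect the real CheckZeroInitialScaleAllowed signature.

```go
package main

import "fmt"

// checkZeroInitialScaleAllowed is a hypothetical stub for the real check.
func checkZeroInitialScaleAllowed() (bool, error) {
	return true, nil
}

// withShadowing uses := inside the block, which declares a new inner
// variable; the outer allowZeroInitialScale is never updated.
func withShadowing(serverless bool) bool {
	allowZeroInitialScale := false
	if serverless {
		allowZeroInitialScale, err := checkZeroInitialScaleAllowed()
		if err != nil {
			return false
		}
		_ = allowZeroInitialScale // only the inner variable holds the result
	}
	return allowZeroInitialScale
}

// withAssignment declares err separately and uses =, so the result is
// stored in the outer variable.
func withAssignment(serverless bool) bool {
	allowZeroInitialScale := false
	if serverless {
		var err error
		allowZeroInitialScale, err = checkZeroInitialScaleAllowed()
		if err != nil {
			return false
		}
	}
	return allowZeroInitialScale
}

func main() {
	fmt.Println(withShadowing(true), withAssignment(true)) // prints "false true"
}
```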


235-241: LGTM! Constructor calls properly updated.

All component constructors (Predictor, Transformer, Explainer) now correctly receive the allowZeroInitialScale parameter, ensuring consistent behavior across components.

pkg/controller/v1beta1/inferenceservice/components/explainer.go (4)

35-35: LGTM! Required import added.

The import for knutils is correctly added to support the validation function call.


51-51: LGTM! New field properly added.

The allowZeroInitialScale boolean field is appropriately added to the Explainer struct.


56-56: LGTM! Constructor properly updated.

The constructor signature and field assignment correctly handle the new allowZeroInitialScale parameter.

Also applies to: 64-64


175-175: LGTM! Validation call properly placed.

The call to ValidateInitialScaleAnnotation is correctly placed in the Knative deployment path, passing all required parameters including isvc.Spec.Explainer.MinReplicas.

pkg/controller/v1beta1/inferenceservice/components/transformer.go (4)

38-38: LGTM! Required import added.

The import for knutils is correctly added to support the validation function call.


54-54: LGTM! New field properly added.

The allowZeroInitialScale boolean field is appropriately added to the Transformer struct.


59-59: LGTM! Constructor properly updated.

The constructor signature and field assignment correctly handle the new allowZeroInitialScale parameter, maintaining consistency with other components.

Also applies to: 67-67


237-237: LGTM! Validation call properly placed.

The call to ValidateInitialScaleAnnotation is correctly placed in the reconcileTransformerKnativeDeployment method, ensuring it only applies to Knative deployments and passes all required parameters including isvc.Spec.Transformer.MinReplicas.

pkg/controller/v1beta1/inferenceservice/components/predictor.go (4)

43-43: LGTM - Importing utility functions for validation.

The import of knutils from the v1alpha1 package is appropriate for accessing the shared validation utilities.


66-66: LGTM - Clean addition of configuration field.

The allowZeroInitialScale boolean field follows Go naming conventions and integrates well with the existing struct design.


71-71: LGTM - Constructor properly updated.

The constructor signature and field assignment correctly implement the new parameter. The change maintains consistency with the existing constructor pattern.

Also applies to: 79-79


720-720: LGTM - Proper validation implementation.

The call to ValidateInitialScaleAnnotation is correctly placed and passes all required parameters:

  • objectMeta.Annotations: The annotations map to validate/modify
  • p.allowZeroInitialScale: The configuration flag
  • isvc.Spec.Predictor.MinReplicas: The minimum replicas setting
  • p.Log: The logger instance

This implementation aligns with the PR objective to restore zero initial scale behavior and follows the validation logic described in the utility function.
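
The wiring described here (a flag stored at construction time and forwarded to the validation call) might look roughly like the sketch below. The `Predictor` type, `NewPredictor`, `reconcileAnnotations`, and `validateFunc` are simplified, hypothetical stand-ins; the logger parameter is omitted and the real constructor takes more arguments.

```go
package main

import "fmt"

// Predictor is a simplified, hypothetical stand-in for the real component.
type Predictor struct {
	allowZeroInitialScale bool
}

// NewPredictor stores the flag so later reconcile steps can use it.
func NewPredictor(allowZeroInitialScale bool) *Predictor {
	return &Predictor{allowZeroInitialScale: allowZeroInitialScale}
}

// validateFunc mirrors the shape of the validation call, minus the logger.
type validateFunc func(annotations map[string]string, allowZeroInitialScale bool, minReplicas *int32)

// reconcileAnnotations forwards the stored flag and the component's
// minimum replicas to the validation function.
func (p *Predictor) reconcileAnnotations(annotations map[string]string, minReplicas *int32, validate validateFunc) {
	validate(annotations, p.allowZeroInitialScale, minReplicas)
}

func main() {
	zero := int32(0)
	predictor := NewPredictor(true)

	// A recording validator that only reports what it received.
	report := func(annotations map[string]string, allow bool, minReplicas *int32) {
		fmt.Printf("allow=%v minReplicas=%d\n", allow, *minReplicas)
	}
	predictor.reconcileAnnotations(map[string]string{}, &zero, report)
}
```

Passing the flag through the constructor keeps the cluster-level check in one place (the controller) while each component applies it to its own MinReplicas.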

pkg/controller/v1beta1/inferenceservice/controller_test.go (3)

284-342: LGTM! Well-structured test for zero initial scale when disallowed.

This test case properly verifies that when Knative is configured to not allow zero initial scale, an InferenceService with zero min replicas does not result in the autoscaling.knative.dev/initial-scale annotation being set. The test follows established patterns with proper setup, resource creation, and assertion.


556-615: LGTM! Comprehensive test for zero initial scale when allowed.

This test case effectively verifies that when Knative is configured to allow zero initial scale, an InferenceService with zero min replicas correctly sets the autoscaling.knative.dev/initial-scale annotation to "0". The test complements the previous test case and together they provide complete coverage for the zero initial scale functionality. The implementation follows best practices with proper setup, teardown, and assertions.


387-387: Good practice: Service name updates for test isolation.

The service name increments (initialscale5, initialscale6, initialscale7, initialscale8) properly isolate the new test cases from existing ones, preventing resource conflicts during test execution. This follows the established naming pattern in the test suite.

Also applies to: 448-448, 509-509, 570-570

pkg/controller/v1alpha1/inferencegraph/controller_test.go (3)

251-304: LGTM! Comprehensive test for zero min replicas behavior when zero initial scale is disallowed.

This test correctly verifies that when MinReplicas is set to 0 but zero initial scale is not allowed in the Knative configuration, the autoscaling.knative.dev/initial-scale annotation is not set on the resulting Knative Service. The test structure follows the established pattern and provides good coverage for this scenario.


348-348: Good practice: Updated graph names to avoid test conflicts.

The graph names have been appropriately updated from "initialscale1/2/3" to "initialscale5/6/7" to prevent conflicts with the newly added test cases. This ensures test isolation and prevents potential race conditions.

Also applies to: 403-403, 458-458


500-553: LGTM! Comprehensive test for zero min replicas behavior when zero initial scale is allowed.

This test correctly verifies that when MinReplicas is set to 0 and zero initial scale is allowed in the Knative configuration, the autoscaling.knative.dev/initial-scale annotation is explicitly set to "0" on the resulting Knative Service. This test case complements the previous test and provides complete coverage for the zero initial scale feature restoration.

The test logic properly:

  • Sets up the environment with zero initial scale allowed
  • Creates an InferenceGraph with MinReplicas set to 0
  • Verifies the annotation is set to "0" as expected

Comment thread pkg/controller/v1alpha1/utils/utils.go
…nitial-scale-behavior-master

Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
This reverts commit 053cf51 because precommit does not like the changes.

Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
@brettmthompson force-pushed the feature/restore-zero-initial-scale-behavior-master branch from 3ecb026 to 11d86f0 on June 30, 2025 22:50
@brettmthompson requested a review from hdefazio on July 1, 2025 14:53
@israel-hdez left a comment

/lgtm

@openshift-ci (bot) commented Jul 3, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andresllh, brettmthompson, israel-hdez

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [andresllh,brettmthompson,israel-hdez]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot (bot) merged commit 3a7402f into opendatahub-io:master on Jul 3, 2025
30 of 31 checks passed
@github-project-automation (bot) moved this from New/Backlog to Done in ODH Model Serving Planning on Jul 3, 2025