250717 sync master by spolti · Pull Request #752 · opendatahub-io/kserve

spolti · 2025-07-17T19:57:51Z

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Type of changes
Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Feature/Issue validation/testing:

Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

Test A
Test B
Logs

Special notes for your reviewer:

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Checklist:

Have you added unit/e2e tests that prove your fix is effective or that this feature works?
Has code been commented, particularly in hard-to-understand areas?
Have you made corresponding changes to the documentation?
Have you linked the JIRA issue(s) to this PR?

Release note:

Re-running failed tests

/rerun-all - rerun all failed workflows.
/rerun-workflow <workflow name> - rerun a specific failed workflow. Only one workflow name can be specified. Multiple /rerun-workflow commands are allowed per comment.

Summary by CodeRabbit

New Features
- Introduced support for LLM (Large Language Model) inference services with new Kubernetes resource types.
- Added configurable storage options for model and logger components, including S3 and HTTP backends.
- Enhanced logging capabilities to store logs in user-specified storage locations and formats.
Improvements
- Added optional status fields to track associated ServingRuntime and ClusterServingRuntime in InferenceService.
- Refined model storage specifications for improved flexibility and schema clarity.
- Improved reconciliation logic to respect auto-update annotations and respond to runtime changes.
- Enhanced agent injection to support logger storage configuration and credential mounting.
Documentation
- Added comprehensive documentation for LLM inference services, model storage, and logger storage specifications.
- Updated Python SDK models and docs to include new resource types and fields.
Bug Fixes
- Fixed logger and storage configuration handling in agent injection and credential management.
Tests
- Expanded test coverage for LLM resources, logger storage, and auto-update reconciliation behavior.

…and CRDs (kserve#4522) Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com> Co-authored-by: Jooho Lee <jlee@redhat.com>

Signed-off-by: Hannah DeFazio <h2defazio@gmail.com> Co-authored-by: Jooho Lee <jlee@redhat.com>

Signed-off-by: Spolti <fspolti@redhat.com> Signed-off-by: Andres Llausas <allausas@redhat.com> Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com> Signed-off-by: tarilabs <matteo.mortari@gmail.com> Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com> Signed-off-by: Jin Dong <jdong183@bloomberg.net> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com> Co-authored-by: Filippe Spolti <fspolti@redhat.com> Co-authored-by: Andres Llausas <allausas@redhat.com> Co-authored-by: Matteo Mortari <matteo.mortari@gmail.com> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Sivanantham <90966311+sivanantha321@users.noreply.github.com> Co-authored-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com> Co-authored-by: Jin Dong <greenmoon55@users.noreply.github.com> Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com> Co-authored-by: Jooho Lee <jlee@redhat.com>

Signed-off-by: cory-johannsen <cjohannsen@cloudera.com> Signed-off-by: cjohannsen-cloudera <cjohannsen@cloudera.com> Co-authored-by: Sivanantham <90966311+sivanantha321@users.noreply.github.com>

…ailable in v1.x) (kserve#4540) Signed-off-by: YehCC52 <hometown456@gmail.com>

Signed-off-by: Andres Llausas <allausas@redhat.com> Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Signed-off-by: Hannah DeFazio <h2defazio@gmail.com>

…erve#4556) Signed-off-by: Vincent Hou <shou73@bloomberg.net>

This batch includes the following commits: 178235a fix: unset clampMax and clampMin, since they are not for replicas (kserve#4556) fba4113 Stop and resume an explainer (kserve#4546) a070e2c Auto-update annotation for isvc. (kserve#4342) 2f338a6 Fix outdated BentoML import in sample code (BentoService no longer available in v1.x) (kserve#4540) 9e55d59 feat: Store Inference logging to blob storage (kserve#4473) b1bae1b Allow OCI for multi-node/multi-gpu (kserve#4441) ce43040 Stop and Resume a transformer (kserve#4534) 40806a7 [API] Define LLMInferenceService and LLMInferenceServiceConfig types and CRDs (kserve#4522)

coderabbitai · 2025-07-17T19:58:03Z

Walkthrough

This update introduces new Kubernetes CustomResourceDefinitions (CRDs) and API types for managing LLMInferenceService and LLMInferenceServiceConfig resources, including their specifications, status, and list types. It also adds Python SDK models and documentation for these resources, and updates OpenAPI schemas to support the new LLM inference service functionality.

Changes

Files / Paths	Change Summary
charts/kserve-crd-minimal/templates/serving.kserve.io_llminferenceserviceconfigs.yaml, config/crd/minimal/serving.kserve.io_llminferenceserviceconfigs.yaml	Add new CRD for `LLMInferenceServiceConfig` (v1alpha1, namespaced)
charts/kserve-crd-minimal/templates/serving.kserve.io_llminferenceservices.yaml, config/crd/minimal/serving.kserve.io_llminferenceservices.yaml	Add new CRD for `LLMInferenceService` (v1alpha1, namespaced)
pkg/apis/serving/v1alpha1/llm_inference_service_types.go	Add Go API types for LLMInferenceService and LLMInferenceServiceConfig, including specs/status
pkg/apis/serving/v1alpha1/v1alpha1.go, pkg/apis/serving/v1alpha1/zz_generated.deepcopy.go	Register and add deepcopy methods for new LLMInferenceService types
python/kserve/kserve/models/v1alpha1_llm_inference_service.py, python/kserve/kserve/models/v1alpha1_llm_inference_service_config.py, python/kserve/kserve/models/v1alpha1_llm_inference_service_list.py, python/kserve/kserve/models/v1alpha1_llm_inference_service_config_list.py	Add Python SDK models for LLMInferenceService and LLMInferenceServiceConfig (and list types)
python/kserve/docs/V1alpha1LLMInferenceService.md, python/kserve/docs/V1alpha1LLMInferenceServiceConfig.md, python/kserve/docs/V1alpha1LLMInferenceServiceList.md, python/kserve/docs/V1alpha1LLMInferenceServiceConfigList.md	Add documentation for new LLMInferenceService/Config types in Python SDK
pkg/openapi/openapi_generated.go, pkg/openapi/swagger.json	Add OpenAPI schema definitions for LLMInferenceService and LLMInferenceServiceConfig
python/kserve/kserve/models/init.py	Export new LLMInferenceService and related models in Python SDK

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Kubernetes API Server
    participant Controller
    participant LLMInferenceService CRD

    User->>Kubernetes API Server: Create LLMInferenceService resource
    Kubernetes API Server->>LLMInferenceService CRD: Store resource
    Kubernetes API Server->>Controller: Notify of new LLMInferenceService
    Controller->>Kubernetes API Server: Read LLMInferenceService spec
    Controller->>Kubernetes API Server: Create/manage Deployments, Services, etc.
    Controller-->>Kubernetes API Server: Update LLMInferenceService status

Estimated code review effort

3 (~45 minutes) — The changes span multiple core API definitions, CRDs, deep copy methods, Python SDK models, and documentation, requiring careful review of schema correctness and consistency.

Possibly related PRs

feat(webhook): introduces validation for configs #723: Adds a validating webhook for LLMInferenceServiceConfig, directly related to the new CRD and types introduced here.
feat(validation): cross-reference field checks #732: Adds a validating webhook for LLMInferenceService, enforcing constraints on the resource introduced in this PR.
fix(scheduler): creates missing endpoint picker service #729: Extends LLMInferenceService controller scheduler reconciliation logic, related to the LLMInferenceService API types added here.

Suggested labels

lgtm

Suggested reviewers

pierDipi
andresllh

Poem

In fields of YAML, CRDs now bloom,
LLMs march in Kubernetes' room.
With specs and configs, new types abound,
Python and Go in harmony found.
Docs and schemas, all aligned—
The future of inference, well-defined!
🐇✨

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ast-grep (0.38.6)

pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go

✨ Finishing Touches

📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 10

🔭 Outside diff range comments (3)

pkg/client/clientset/versioned/fake/clientset_generated.go (1)
54-66: Fix the typo in variable name.

The enhancement to extract and forward ListOptions from watch actions is a good improvement for more precise watch behavior in tests. However, there's a typo in the variable name.
-		if watchActcion, ok := action.(testing.WatchActionImpl); ok {
+		if watchAction, ok := action.(testing.WatchActionImpl); ok {
-			opts = watchActcion.ListOptions
+			opts = watchAction.ListOptions
config/crd/full/serving.kserve.io_clusterservingruntimes.yaml (2)
2717-2779: Duplicate schema error – same misplacement in workerSpec.containers

The identical stopSignal insertion under lifecycle recurs here. After updating the main containers definition, replicate the correction for workerSpec.containers to avoid schema drift.
-                              preStop: …
-                              stopSignal:
-                                type: string
-                            type: object
+                              preStop: …
+                            type: object
+                          stopSignal:
+                            type: string
Keep the two container schemas consistent.

736-798: Realign stopSignal outside of lifecycle in the CRD

The stopSignal property is currently nested under lifecycle.properties (lines ~796–798), but according to upstream Kubernetes (v1.29+), it must be a top-level field of the Container object, alongside lifecycle, not inside it.

Please update config/crd/full/serving.kserve.io_clusterservingruntimes.yaml as follows:

• File: config/crd/full/serving.kserve.io_clusterservingruntimes.yaml
Affected block around lines 674–798.

Proposed diff:
   …
   properties:
     imagePullPolicy:
       type: string
-    lifecycle:
-      properties:
-        postStart: …
-        preStop: …
-        stopSignal:
-          type: string
-      type: object
+    lifecycle:
+      properties:
+        postStart: …
+        preStop: …
+      type: object
+    stopSignal:
+      type: string
   …
• Afterwards, verify the controller’s pod-spec construction to ensure it copies stopSignal from the CRD into the generated corev1.Container.
Failing to move this now will cause valid Kubernetes manifests to be rejected and users’ stopSignal settings to be dropped at runtime.

♻️ Duplicate comments (40)

pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go (2)

1146-1148: Consistent status field additions.

Same pattern as previously reviewed - correctly adds the new status fields.

4828-4830: Consistent status field additions.

Same pattern as previously reviewed.

config/crd/full/serving.kserve.io_inferenceservices.yaml (23)

1601-1603: Same stopSignal improvement applies here

Please replicate the description/validation addition suggested in the previous comment.

2357-2359: Same stopSignal improvement applies here

4772-4774: Same stopSignal improvement applies here

5518-5520: Same stopSignal improvement applies here

6217-6219: Same stopSignal improvement applies here

6902-6904: Same stopSignal improvement applies here

7374-7386: Replicate logger.storage validation

See first logger.storage comment — the same required-field/description feedback applies here.

7629-7631: Same stopSignal improvement applies here

8336-8338: Same stopSignal improvement applies here

9038-9040: Same stopSignal improvement applies here

9727-9729: Same stopSignal improvement applies here

10423-10425: Same stopSignal improvement applies here

11291-11293: Same stopSignal improvement applies here

11982-11984: Same stopSignal improvement applies here

12751-12753: Same stopSignal improvement applies here

14668-14670: Same stopSignal improvement applies here

15377-15379: Same stopSignal improvement applies here

16094-16096: Same stopSignal improvement applies here

17832-17834: Same stopSignal improvement applies here

19144-19146: Same stopSignal improvement applies here

19900-19902: Same stopSignal improvement applies here

20368-20380: Replicate logger.storage validation

See earlier logger.storage comment — same feedback applies.

21628-21630: Do the same for servingRuntimeName

Add a printColumn and description, mirroring the recommendation for clusterServingRuntimeName.

pkg/openapi/openapi_generated.go (14)

3397-3397: Consistent init containers documentation across schemas.

3887-3887: Consistent init containers documentation maintained.

5782-5782: ModelStorageSpec migration continues consistently.

Also applies to: 5789-5789

6541-6541: Storage spec migration pattern maintained.

Also applies to: 6548-6548

7215-7215: Storage spec migration continues.

Also applies to: 7223-7223

7658-7658: Consistent storage spec update.

Also applies to: 7665-7665

8061-8061: Storage spec migration maintained.

Also applies to: 8068-8068

8385-8385: Consistent ModelStorageSpec reference.

Also applies to: 8392-8392

9251-9251: Storage spec update applied.

Also applies to: 9258-9258

10281-10281: ModelStorageSpec migration continues.

Also applies to: 10288-10288

10363-10363: Key field documentation clarified.

10688-10688: Storage spec migration applied.

Also applies to: 10695-10695

11013-11013: Consistent storage spec update.

Also applies to: 11020-11020

11905-11905: ModelStorageSpec reference updated.

Also applies to: 11912-11912

pkg/apis/serving/v1alpha1/zz_generated.deepcopy.go (1)

742-751: Deepcopy complexity due to pointer to map.

The deepcopy implementation for Parameters field requires nested nil checks due to the pointer to map pattern. This reinforces the suggestion to simplify the field to just map[string]string.

🧹 Nitpick comments (29)

python/kserve/docs/V1beta1CustomTransformer.md (1)

22-22: Docs wording still quite verbose
LanguageTool flags “prior to”, “taken into account”, and “in a similar fashion” as wordy. If you ever touch the generator, consider trimming those to “before”, “considered”, and “similarly” for brevity.

python/kserve/kserve/models/v1beta1_custom_explainer.py (2)

610-611: Same verbose phrasing carried over from generator
The docstring still uses the wordy trio (“prior to”, “taken into account”, “in a similar fashion”). Optional: simplify to “before”, “considered”, “similarly” to align with concise style elsewhere.

621-622: Duplicate comment – applies to setter docstring as well
Setter docstring repeats the exact sentence; same wording suggestion as above. If this file is regenerated, fix in the template to avoid repeating manual edits.
python/kserve/docs/V1beta1ModelStorageSpec.md (1)
4-4: Add blank lines around table for better formatting

The table should be surrounded by blank lines for better readability and to follow markdown best practices.
 ## Properties
+
 Name | Type | Description | Notes
 ------------ | ------------- | ------------- | -------------
 **key** | **str** | The Storage Key in the secret for this object. | [optional] 
 **parameters** | **dict(str, str)** | Parameters to override the default storage credentials and config. | [optional] 
 **path** | **str** | The path to the object in the storage. Note that this path is relative to the storage URI. | [optional] 
 **schema_path** | **str** | The path to the model schema file in the storage. | [optional] 
+
python/kserve/docs/V1alpha1LLMInferenceServiceList.md (1)
1-15: Fix markdown formatting issues.

The documentation content is correct, but there are several formatting issues that should be addressed:

Apply this diff to fix the markdown formatting:
 # V1alpha1LLMInferenceServiceList
 
 LLMInferenceServiceList is the list type for LLMInferenceService.
+
 ## Properties
 Name | Type | Description | Notes
 ------------ | ------------- | ------------- | -------------
-**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional] 
-**items** | [**list[V1alpha1LLMInferenceService]**](V1alpha1LLMInferenceService.md) |  | 
-**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | [optional] 
-**metadata** | [**V1ListMeta**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ListMeta.md) |  | [optional] 
+**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: <https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources> | [optional]
+**items** | [**list[V1alpha1LLMInferenceService]**](V1alpha1LLMInferenceService.md) |  |
+**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: <https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds> | [optional]
+**metadata** | [**V1ListMeta**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ListMeta.md) |  | [optional]
+
 
 [[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md)
 
-
python/kserve/docs/V1beta1LoggerStorageSpec.md (1)
3-10: Fix table formatting and add missing description.

The documentation has a formatting issue and the service_account_name field lacks a description.

Apply this diff to fix the formatting:
 # V1beta1LoggerStorageSpec
 
+
 ## Properties
 Name | Type | Description | Notes
 ------------ | ------------- | ------------- | -------------
 **key** | **str** | The Storage Key in the secret for this object. | [optional] 
 **parameters** | **dict(str, str)** | Parameters to override the default storage credentials and config. | [optional] 
 **path** | **str** | The path to the object in the storage. Note that this path is relative to the storage URI. | [optional] 
-**service_account_name** | **str** |  | [optional] 
+**service_account_name** | **str** | The name of the service account to use for storage authentication. | [optional]
+
python/kserve/docs/V1beta1StorageSpec.md (1)
5-5: Fix markdown table formatting.

The markdown table should be surrounded by blank lines for proper rendering.
 **path** | **str** | The path to the object in the storage. Note that this path is relative to the storage URI. | [optional] 
+
 Name | Type | Description | Notes
+
 [[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md)
python/kserve/kserve/models/v1beta1_logger_spec.py (1)
158-177: Storage property implementation follows established patterns.

The getter and setter methods for the storage property are implemented consistently with other properties in this class. However, the docstring is missing a description of what the storage property represents.

Consider adding a description to the docstring similar to other properties:
-    """Gets the storage of this V1beta1LoggerSpec.  # noqa: E501
-
-
-        :return: The storage of this V1beta1LoggerSpec.  # noqa: E501
+    """Gets the storage of this V1beta1LoggerSpec.  # noqa: E501
+
+        Storage configuration for inference logger cloud events.  # noqa: E501
+
+        :return: The storage of this V1beta1LoggerSpec.  # noqa: E501
And similarly for the setter:
-    """Sets the storage of this V1beta1LoggerSpec.
-
-
-        :param storage: The storage of this V1beta1LoggerSpec.  # noqa: E501
+    """Sets the storage of this V1beta1LoggerSpec.
+
+        Storage configuration for inference logger cloud events.  # noqa: E501
+
+        :param storage: The storage of this V1beta1LoggerSpec.  # noqa: E501
python/kserve/docs/V1alpha1LLMInferenceServiceConfigList.md (1)
1-15: Documentation content is accurate and follows standard patterns.

The documentation properly describes the LLMInferenceServiceConfigList type with standard Kubernetes list properties and appropriate links to related documentation.

Address the markdown formatting issues identified by static analysis:
+
 ## Properties
+
 Name | Type | Description | Notes
 ------------ | ------------- | ------------- | -------------
-**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional] 
-**items** | [**list[V1alpha1LLMInferenceServiceConfig]**](V1alpha1LLMInferenceServiceConfig.md) |  | 
-**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | [optional] 
-**metadata** | [**V1ListMeta**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ListMeta.md) |  | [optional] 
+**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: <https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources> | [optional] 
+**items** | [**list[V1alpha1LLMInferenceServiceConfig]**](V1alpha1LLMInferenceServiceConfig.md) |  | 
+**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: <https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds> | [optional] 
+**metadata** | [**V1ListMeta**](<https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ListMeta.md>) |  | [optional] 
+
+[[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md)

-[[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md)
-
This addresses:

Missing blank lines around the table

Bare URLs that should be wrapped in angle brackets
python/kserve/docs/V1alpha1LLMInferenceServiceConfig.md (1)
1-14: LGTM! Clear documentation with minor formatting improvements needed.

The documentation provides a clear description of the LLMInferenceServiceConfig API resource with proper property descriptions and navigation links.

Consider addressing the markdown formatting issues flagged by static analysis:
+
 ## Properties
+
 Name | Type | Description | Notes
Also consider using reference-style links for the long URLs in the table to improve readability:
-**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional] 
+**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: [API Conventions][api-conventions] | [optional] 

+[api-conventions]: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
pkg/controller/v1beta1/inferenceservice/components/predictor.go (1)

345-352: LGTM - Consistent runtime type handling

The code correctly applies the same mutual exclusivity logic for auto-selected runtimes as used for explicitly specified runtimes. The status fields are properly maintained to reflect the runtime type.

Note: There's a potential minor optimization opportunity here as GetServingRuntime is called twice for the same runtime (once during selection and once here for the boolean), but this doesn't affect correctness.
config/crd/full/serving.kserve.io_servingruntimes.yaml (2)
796-798: Document & validate the newly-added stopSignal field

stopSignal is introduced as a plain string, but its accepted values and purpose are not obvious. Tightening the schema will improve UX and prevent typos.
   stopSignal:
+    description: "POSIX signal (e.g. SIGTERM) that the runtime controller should send to gracefully terminate the container."
+    pattern: "^SIG[A-Z0-9]+$"
     type: string
Adding a description and a simple pattern (or an enum if only a few signals are allowed) keeps the field optional while making its intent self-evident.

2777-2779: Apply the same improvements to the worker container schema

The workerSpec.containers stanza now mirrors the main container spec. Ensure any description / validation you add for the primary stopSignal is duplicated here, and confirm your reconciliation logic actually propagates workerSpec.stopSignal down to the generated Pod template.
go.mod (1)
173-182: Document the rationale for dependency pinning.

The exclude directive and replace directive are used to resolve version conflicts. While the comments explain the immediate reason, consider adding more context about why KEDA requires 0.19 and GIE uses 0.21 of controller-runtime.
 // google.golang.org/grpc/stats/opentelemetry is used by the keda package.
 // To avoid ambiguous import error due to multiple module versions found for the same import path,
 // we force the version in google.golang.org/grpc (v1.71.0).
 // Instead of replacing the path directly, exclude the problematic package
 // This avoids "module used for two different module paths" error
 exclude google.golang.org/grpc/stats/opentelemetry v0.0.0-00010101000000-000000000000

-// Currently, KEDA requires 0.19 but GIE uses 0.21
+// Currently, KEDA v2.16.1 requires controller-runtime 0.19.x for compatibility with its internal APIs,
+// while gateway-api-inference-extension (GIE) v0.3.0 uses 0.21.x. We pin to 0.19.7 to maintain
+// compatibility with KEDA until it supports newer controller-runtime versions.
 replace sigs.k8s.io/controller-runtime => sigs.k8s.io/controller-runtime v0.19.7
pkg/constants/constants.go (1)
125-129: Consider grouping related annotation keys.

The new annotation keys are well-named and follow the existing pattern. However, for better organization, consider grouping all logger-related annotation keys together.
 	PrometheusPathAnnotationKey                 = "prometheus.io/path"
 	StorageReadonlyAnnotationKey                = "storage.kserve.io/readonly"
 	DefaultPrometheusPath                       = "/metrics"
 	QueueProxyAggregatePrometheusMetricsPort    = "9088"
 	DefaultPodPrometheusPort                    = "9091"
 	NodeGroupAnnotationKey                      = KServeAPIGroupName + "/nodegroup"
+	// Logger-related annotations
 	LoggerSecretNameKey                         = KServeAPIGroupName + "/logger-secret-name"
 	LoggerCredentialPathKey                     = KServeAPIGroupName + "/logger-secret-path"
 	LoggerCredentialFileKey                     = KServeAPIGroupName + "/logger-secret-file"
+	// Auto-update control
 	DisableAutoUpdateAnnotationKey              = KServeAPIGroupName + "/disable-auto-update"
pkg/logger/mock.go (1)
26-36: Consider adding buffer to the response channel.

The unbuffered channel could cause blocking if the test doesn't consume messages immediately.
 func NewMockStore(storageSpec *v1beta1.StorageSpec) *MockStore {
 	return &MockStore{
 		StorageSpec:  storageSpec,
-		ResponseChan: make(chan *LogRequest),
+		ResponseChan: make(chan *LogRequest, 100), // Buffer to prevent blocking
 	}
 }
python/kserve/kserve/models/v1beta1_logger_storage_spec.py (2)
157-157: Add missing docstring for service_account_name property.

The getter method for service_account_name is missing its docstring while all other properties have proper documentation.
-
+        The service account name for accessing the storage.  # noqa: E501
63-83: Consider adding type hints for better IDE support.

While this is auto-generated code, adding type hints would improve developer experience.
-    def __init__(self, key=None, parameters=None, path=None, service_account_name=None, local_vars_configuration=None):  # noqa: E501
+    def __init__(self, key: str = None, parameters: dict = None, path: str = None, service_account_name: str = None, local_vars_configuration=None):  # noqa: E501
pkg/controller/v1beta1/inferenceservice/controller.go (1)
530-530: Remove or implement the TODO comment.

The TODO comment suggests finding a way to distinguish ServingRuntime types, but the current implementation already handles them separately with different handlers.
-	// TODO: Find a way to distinguish if the ServingRuntime is a ClusterServingRuntime or not
pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go (1)
3776-4484: Consider refactoring repetitive test setup code.

The test cases contain significant duplication in ServingRuntime and InferenceService setup. Consider extracting common setup into helper functions to improve maintainability.

Example helper functions:
func createTestServingRuntime(name, namespace, framework string) *v1alpha1.ServingRuntime {
    return &v1alpha1.ServingRuntime{
        ObjectMeta: metav1.ObjectMeta{
            Name:      name,
            Namespace: namespace,
        },
        Spec: v1alpha1.ServingRuntimeSpec{
            SupportedModelFormats: []v1alpha1.SupportedModelFormat{
                {
                    Name:       framework,
                    Version:    proto.String("1"),
                    AutoSelect: proto.Bool(true),
                },
            },
            ServingRuntimePodSpec: v1alpha1.ServingRuntimePodSpec{
                Labels: map[string]string{
                    "key1": "val1FromSR",
                    "key2": "val2FromSR",
                    "key3": "val3FromSR",
                },
                // ... rest of the spec
            },
            Disabled: proto.Bool(false),
        },
    }
}

func createTestInferenceService(name, namespace string, autoUpdateValue string) *v1beta1.InferenceService {
    // Similar helper for InferenceService creation
}
config/crd/full/serving.kserve.io_inferenceservices.yaml (3)
738-740: Add description & regex validation for new stopSignal field

Only the type is declared. Adding a short description plus a conservative regex such as '^SIG[A-Z]+$' (or an enum with the signals you support) will self-document intent and prevent clearly invalid input at admission time.
 stopSignal:
+  description: Signal (e.g. "SIGTERM") used to gracefully terminate the model server.
+  pattern: '^SIG[A-Z]+$'
   type: string
2825-2837: logger.storage schema is underspecified — clarify required fields

At least one of key or path is presumably mandatory for a logger target. Consider adding a required: ["key"] (or ["path"]) array, plus descriptions for each property to make the CRD self-describing. Without this, invalid empty objects will pass validation and fail later at runtime.

21480-21482: Expose clusterServingRuntimeName via printColumn and add description

Adding
x-kubernetes-print-column:
  - name: ClusterRuntime
    type: string
    jsonPath: .status.clusterServingRuntimeName
makes this useful in kubectl get listings. Also document the field with a description to clarify its intent.
pkg/webhook/admission/pod/agent_injector.go (1)
209-223: Consider refactoring for improved readability.

The nested nil checks are correct but could be simplified for better readability.

Consider this refactor:
-		storagePath := ""
-		if ag.loggerConfig.Store != nil {
-			if ag.loggerConfig.Store.Path != nil {
-				storagePath = *ag.loggerConfig.Store.Path
-			}
-		}
-		storageFormat := ""
-		if ag.loggerConfig.Store != nil {
-			if ag.loggerConfig.Store.Parameters != nil {
-				format, ok := (*ag.loggerConfig.Store.Parameters)[constants.LoggerFormatKey]
-				if ok {
-					storageFormat = format
-				}
-			}
-		}
+		storagePath := ""
+		storageFormat := ""
+		if ag.loggerConfig.Store != nil {
+			if ag.loggerConfig.Store.Path != nil {
+				storagePath = *ag.loggerConfig.Store.Path
+			}
+			if ag.loggerConfig.Store.Parameters != nil {
+				if format, ok := (*ag.loggerConfig.Store.Parameters)[constants.LoggerFormatKey]; ok {
+					storageFormat = format
+				}
+			}
+		}
pkg/openapi/swagger.json (2)

3306-3310: Potential naming ambiguity between clusterServingRuntimeName and servingRuntimeName

Both fields are plain strings with no validation hints (enum/regex). Because their semantic difference is subtle (cluster-scoped vs namespaced runtime), it will be easy for users and tooling to mix them up.

Consider:

adding x-kubernetes-validations (regex) to enforce qualified names (e.g. <ns>/<name> vs just <name>), or

embedding them in a single object (runtimeRef { name, namespace?, clusterScoped? }).

This will avoid divergent conventions across clients and SDKs.

Also applies to: 3342-3345

3681-3714: Minor: serviceAccountName in LoggerStorageSpec lacks description

Every other field has a helpful description. Add one for consistency (e.g. “K8s ServiceAccount to access the backing object store when different from the pod’s default SA”).
pkg/logger/store.go (2)
52-69: Consider case-insensitive format comparison.

The marshaller implementation is clean. However, consider making the format comparison case-insensitive for better usability.
 func getMarshaller(format string) (Marshaller, error) {
-	switch format {
+	switch strings.ToLower(format) {
 	case "json":
 		return &JSONMarshaller{}, nil
 	default:
 		return nil, fmt.Errorf("unsupported format %s", format)
 	}
 }
119-153: Enhance success logging with context.

The upload logic is solid, but the success log could be more informative.
-	s.log.Info("Successfully uploaded object to S3")
+	s.log.Infof("Successfully uploaded log to S3: bucket=%s, key=%s", bucket, objectKey)
pkg/apis/serving/v1alpha1/llm_inference_service_types.go (1)
149-210: Consider documenting the design asymmetry.

The RouterSpec design where HTTPRoute can be either managed (via Spec) or referenced (via Refs), while Gateway and Ingress only support references, is architecturally sound. However, consider adding a comment explaining this design choice for clarity.

Add a comment before GatewaySpec:
// GatewaySpec defines the configuration for a Gateway API Gateway.
// Unlike HTTPRouteSpec, Gateway resources are always user-managed (never created by the controller)
// to avoid conflicts with cluster-wide Gateway controllers.
type GatewaySpec struct {

coderabbitai

Actionable comments posted: 7

♻️ Duplicate comments (18)

config/crd/full/serving.kserve.io_inferenceservices.yaml (18)

1601-1602: Same observation as above – please add a description and (optionally) a simple validation.

2357-2358: Same observation as above – please add a description and (optionally) a simple validation.

7374-7386: Same logger.storage remarks as above – descriptions & required list are missing.

7629-7630: Same stopSignal remarks as above – please add description & validation.

8336-8337: Same stopSignal remarks as above – please add description & validation.

9038-9039: Same stopSignal remarks as above – please add description & validation.

9727-9728: Same stopSignal remarks as above – please add description & validation.

10423-10424: Same stopSignal remarks as above – please add description & validation.

11291-11292: Same stopSignal remarks as above – please add description & validation.

11982-11983: Same stopSignal remarks as above – please add description & validation.

12751-12752: Same stopSignal remarks as above – please add description & validation.

14668-14669: Same stopSignal remarks as above – please add description & validation.

15377-15378: Same stopSignal remarks as above – please add description & validation.

16094-16095: Same stopSignal remarks as above – please add description & validation.

17832-17833: Same stopSignal remarks as above – please add description & validation.

19144-19145: Same stopSignal remarks as above – please add description & validation.

19900-19901: Same stopSignal remarks as above – please add description & validation.

20368-20380: Same logger.storage remarks as above – descriptions & required list are missing.

🧹 Nitpick comments (26)

python/kserve/docs/V1beta1PredictorExtensionSpec.md (1)
27-27: Add a short description for the new storage field

The type reference was updated correctly, but the description column was left blank. A one–liner helps users understand why the new ModelStorageSpec matters.
-**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) |  | [optional] 
+**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) | Configuration for locating the model artifact (URI, access mode, credentials, etc.). | [optional] 
python/kserve/docs/V1beta1ARTExplainerSpec.md (1)
27-27: Fill in the missing description for storage

Replicating the blank description leaves readers guessing. Consider adding a concise explanation, e.g.:
-**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) |  | [optional] 
+**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) | Where the explainer’s model artifacts are stored and how to retrieve them. | [optional] 
python/kserve/docs/V1beta1ONNXRuntimeSpec.md (1)
27-27: Clarify the purpose of storage

Same issue as the other specs: the cell is empty.
-**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) |  | [optional] 
+**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) | Detailed spec for downloading the ONNX model (URI, schema path, auth). | [optional] 
python/kserve/docs/V1beta1PaddleServerSpec.md (1)
26-26: Populate the empty description column

Consistency with the other spec tables improves usability.
-**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) |  | [optional] 
+**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) | Location and credentials for Paddle model artifacts. | [optional] 
pkg/agent/storage/gcs.go (1)

64-66: Placeholder implementation for GCS upload functionality.

The method correctly implements the new UploadObject interface requirement with an appropriate error message indicating the functionality is not yet implemented.

Consider creating a tracking issue for implementing GCS upload functionality to complete the storage provider interface.
python/kserve/docs/V1beta1LoggerSpec.md (1)
10-10: Fix markdown table formatting issues.

The addition of the storage property is correct and aligns with the logger storage enhancement. However, the markdown table has formatting issues that need to be addressed.

Apply this fix to correct the table formatting:
-**storage** | [**V1beta1LoggerStorageSpec**](V1beta1LoggerStorageSpec.md) |  | [optional] 
+**storage** | [**V1beta1LoggerStorageSpec**](V1beta1LoggerStorageSpec.md) |  | [optional]
The table should have consistent column count and no trailing pipes.
pkg/client/clientset/versioned/fake/clientset_generated.go (1)
55-61: Good enhancement with a typo that needs fixing.

The enhancement to extract and pass ListOptions to the watch reactor is a valuable improvement that enables proper filtering and option handling in the fake clientset. However, there's a typo on line 56.

Fix the typo in the variable name:
-	if watchActcion, ok := action.(testing.WatchActionImpl); ok {
+	if watchAction, ok := action.(testing.WatchActionImpl); ok {
python/kserve/docs/V1alpha1LLMInferenceService.md (1)
1-16: Fix markdown formatting issues while maintaining good content structure.

The documentation content is well-structured and appropriately describes the LLM inference service resource. However, there are some markdown formatting issues that should be addressed:
+
 Name | Type | Description | Notes
 ------------ | ------------- | ------------- | -------------
-**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional] 
-**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | [optional] 
+**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: <https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources> | [optional] 
+**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: <https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds> | [optional] 
 **metadata** | [**V1ObjectMeta**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ObjectMeta.md) |  | [optional] 
 **spec** | [**V1alpha1LLMInferenceServiceSpec**](V1alpha1LLMInferenceServiceSpec.md) |  | [optional] 
 **status** | [**V1alpha1LLMInferenceServiceStatus**](V1alpha1LLMInferenceServiceStatus.md) |  | [optional] 
+
python/kserve/docs/V1beta1ModelStorageSpec.md (1)
1-14: Fix markdown table formatting for consistency.

The documentation content accurately describes the ModelStorageSpec properties. However, there's a minor formatting issue with the table spacing.
 # V1beta1ModelStorageSpec

+
 ## Properties
 Name | Type | Description | Notes
 ------------ | ------------- | ------------- | -------------
 **key** | **str** | The Storage Key in the secret for this object. | [optional] 
 **parameters** | **dict(str, str)** | Parameters to override the default storage credentials and config. | [optional] 
 **path** | **str** | The path to the object in the storage. Note that this path is relative to the storage URI. | [optional] 
 **schema_path** | **str** | The path to the model schema file in the storage. | [optional] 
+
python/kserve/docs/V1alpha1LLMInferenceServiceList.md (1)
4-10: Fix table formatting for better markdown rendering.

The table should be surrounded by blank lines to comply with markdown standards and ensure proper rendering.
 # V1alpha1LLMInferenceServiceList

 LLMInferenceServiceList is the list type for LLMInferenceService.
+
 ## Properties
 Name | Type | Description | Notes
 ------------ | ------------- | ------------- | -------------
 **api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional] 
 **items** | [**list[V1alpha1LLMInferenceService]**](V1alpha1LLMInferenceService.md) |  | 
 **kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | [optional] 
 **metadata** | [**V1ListMeta**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ListMeta.md) |  | [optional] 
+
python/kserve/docs/V1beta1LoggerStorageSpec.md (1)
3-10: Fix table formatting issues for proper markdown rendering.

The table has multiple formatting problems that need to be addressed:

Missing blank lines around the table

Trailing pipe without proper column data on line 10

Missing description for the service_account_name field
 # V1beta1LoggerStorageSpec

+
 ## Properties
 Name | Type | Description | Notes
 ------------ | ------------- | ------------- | -------------
 **key** | **str** | The Storage Key in the secret for this object. | [optional] 
 **parameters** | **dict(str, str)** | Parameters to override the default storage credentials and config. | [optional] 
 **path** | **str** | The path to the object in the storage. Note that this path is relative to the storage URI. | [optional] 
-**service_account_name** | **str** |  | [optional] 
+**service_account_name** | **str** | The service account name for storage access. | [optional]
+
python/kserve/docs/V1alpha1LLMInferenceServiceConfig.md (1)
4-10: Add blank lines around the table for better markdown compliance.

The table should be surrounded by blank lines according to markdown best practices.
 LLMInferenceServiceConfig is the Schema for the llminferenceserviceconfigs API. It acts as a template to provide base configurations that can be inherited by multiple LLMInferenceService instances.
+
 ## Properties
 Name | Type | Description | Notes
 ------------ | ------------- | ------------- | -------------
 **api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional] 
 **kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | [optional] 
 **metadata** | [**V1ObjectMeta**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ObjectMeta.md) |  | [optional] 
 **spec** | [**V1alpha1LLMInferenceServiceSpec**](V1alpha1LLMInferenceServiceSpec.md) |  | [optional] 
+
python/kserve/docs/V1beta1StorageSpec.md (1)
4-10: Add blank lines around the table for better markdown compliance.

The table should be surrounded by blank lines according to markdown best practices.
 StorageSpec defines a spec for an object in an object store
+
 ## Properties
 Name | Type | Description | Notes
 ------------ | ------------- | ------------- | -------------
 **key** | **str** | The Storage Key in the secret for this object. | [optional] 
 **parameters** | **dict(str, str)** | Parameters to override the default storage credentials and config. | [optional] 
 **path** | **str** | The path to the object in the storage. Note that this path is relative to the storage URI. | [optional] 
+
python/kserve/kserve/models/v1beta1_logger_storage_spec.py (1)
153-172: Missing detailed documentation for service_account_name property.

The service_account_name property getter and setter lack detailed docstrings compared to other properties (like key, parameters, path). For consistency and better API documentation, consider adding a description similar to the other properties.
 @property
 def service_account_name(self):
     """Gets the service_account_name of this V1beta1LoggerStorageSpec.  # noqa: E501

+    The name of the service account to use for accessing the storage.  # noqa: E501

     :return: The service_account_name of this V1beta1LoggerStorageSpec.  # noqa: E501
     :rtype: str
     """
     return self._service_account_name

 @service_account_name.setter
 def service_account_name(self, service_account_name):
     """Sets the service_account_name of this V1beta1LoggerStorageSpec.

+    The name of the service account to use for accessing the storage.  # noqa: E501

     :param service_account_name: The service_account_name of this V1beta1LoggerStorageSpec.  # noqa: E501
     :type: str
     """
config/crd/full/serving.kserve.io_inferenceservices.yaml (4)
738-739: Add description & basic validation for the new stopSignal field

The UI/CLI will surface this field without any context. Add at least a description and (optionally) an enum/x-kubernetes-validations rule so users know what values such as SIGTERM, SIGKILL, etc. are accepted.
 stopSignal:
+  description: Signal that the control-plane sends to gracefully stop the
+    container (e.g. "SIGTERM"). If omitted the default Docker stop signal
+    is used.
   type: string
+# x-kubernetes-validations:
+# - rule: self =~ '^SIG[A-Z]+$'
+#   message: stopSignal must be a POSIX signal name, e.g. SIGTERM
2825-2837: logger.storage object lacks descriptions & required markers

The new storage block is self-explanatory only for maintainers. Add description fields and declare which keys are required (at minimum key or path looks mandatory).
 storage:
   properties:
     key:
+      description: Identifier of the storage backend (e.g. "s3", "gcs").
       type: string
     parameters:
       additionalProperties:
         type: string
       type: object
+      description: Arbitrary backend-specific configuration parameters.
     path:
+      description: Absolute path or URI prefix inside the backend where
+        request/response logs are written.
       type: string
     serviceAccountName:
+      description: Service account used by the logger sidecar to access
+        the storage backend.
       type: string
   type: object
+# required:
+#   - key
21480-21481: Add description for clusterServingRuntimeName

Without a description it shows up as an opaque string in generated docs/CRD explain output.
 clusterServingRuntimeName:
+  description: Name of the cluster-level ServingRuntime resource backing
+    this InferenceService, when applicable.
   type: string
21628-21629: Add description for servingRuntimeName

Same documentation concern as above.
 servingRuntimeName:
+  description: Name of the namespace-scoped ServingRuntime resource that
+    instantiated this InferenceService.
   type: string
pkg/logger/mock.go (1)
26-43: Consider using a buffered channel to prevent blocking

The ResponseChan is unbuffered, which means the Store method will block if there's no goroutine ready to receive from the channel. This could cause issues in tests if the receiver isn't set up before calling Store.
 func NewMockStore(storageSpec *v1beta1.StorageSpec) *MockStore {
 	return &MockStore{
 		StorageSpec:  storageSpec,
-		ResponseChan: make(chan *LogRequest),
+		ResponseChan: make(chan *LogRequest, 1),
 	}
 }
pkg/webhook/admission/pod/agent_injector.go (1)
209-223: Simplify redundant nil checks

The code has redundant checks for ag.loggerConfig.Store. Consider combining them for better readability.
-		storagePath := ""
-		if ag.loggerConfig.Store != nil {
-			if ag.loggerConfig.Store.Path != nil {
-				storagePath = *ag.loggerConfig.Store.Path
-			}
-		}
-		storageFormat := ""
-		if ag.loggerConfig.Store != nil {
-			if ag.loggerConfig.Store.Parameters != nil {
-				format, ok := (*ag.loggerConfig.Store.Parameters)[constants.LoggerFormatKey]
-				if ok {
-					storageFormat = format
-				}
-			}
-		}
+		storagePath := ""
+		storageFormat := ""
+		if ag.loggerConfig.Store != nil {
+			if ag.loggerConfig.Store.Path != nil {
+				storagePath = *ag.loggerConfig.Store.Path
+			}
+			if ag.loggerConfig.Store.Parameters != nil {
+				if format, ok := (*ag.loggerConfig.Store.Parameters)[constants.LoggerFormatKey]; ok {
+					storageFormat = format
+				}
+			}
+		}
pkg/openapi/swagger.json (2)
3306-3309: New runtime-name fields: add omitempty & RBAC-safe exposure

The newly introduced clusterServingRuntimeName and servingRuntimeName fields are optional, but the schema lacks x-kubernetes-list-type/nullable/omitempty hints.
Without omitempty, server-side apply can introduce superfluous empty strings, cluttering status patches.

Likewise, exposing cluster-scoped runtime names inside namespace-scoped resources can leak information across tenants; confirm that the controller cleanses these fields when multi-tenant RBAC is in effect.

Also applies to: 3342-3345

3681-3714: serviceAccountName property undocumented

v1beta1.LoggerStorageSpec.serviceAccountName lacks a description unlike its peers. Add a short docstring so the generated client & docs are self-explanatory.
         "serviceAccountName": {
-          "type": "string"
+          "description": "Kubernetes ServiceAccount used by the logger side-car (overrides default SA).",
+          "type": "string"
         }
pkg/logger/store.go (3)
38-50: Consider renaming the parameter for clarity

The parameter is named url but it's actually a string that hasn't been parsed yet. Consider renaming it to urlStr or rawURL to make this clearer.
-func GetStorageStrategy(url string) StorageStrategy {
+func GetStorageStrategy(urlStr string) StorageStrategy {
 	// http, https
 	switch {
-	case strings.HasPrefix(url, "http"): // http, https
+	case strings.HasPrefix(urlStr, "http"): // http, https
 		return HttpStorage
 
-	case strings.HasPrefix(url, "s3"): // s3, s3a
+	case strings.HasPrefix(urlStr, "s3"): // s3, s3a
 		return S3Storage
 
 	default:
 		return DefaultStorage
 	}
}
104-108: Consider a cleaner approach for protocol conversion

The current approach of appending "://" to convert scheme to Protocol is a workaround. Consider creating a dedicated mapping function or using a cleaner conversion approach.
-	// Convert to a Protocol to reuse existing types
-	if !strings.HasSuffix(scheme, "://") {
-		scheme += "://"
-	}
-	protocol := storage.Protocol(scheme)
+	// Map scheme to Protocol
+	protocol := storage.Protocol(scheme + "://")
151-151: Enhance the success log message with more context

The success message could be more informative by including details about what was uploaded.
-	s.log.Info("Successfully uploaded object to S3")
+	s.log.Infof("Successfully uploaded log to S3: bucket=%s, key=%s", bucket, objectKey)
pkg/apis/serving/v1alpha1/llm_inference_service_types.go (1)

149-172: Consider adding validation for mutually exclusive fields

The comment states that Route/Gateway and Ingress fields are mutually exclusive, but this isn't enforced at the type level. Consider adding OpenAPI schema validation or webhook validation to ensure only one routing option is specified.

chore: required by "sigs.k8s.io/gateway-api-inference-extension/api/v1alpha2" -> llm_inference_service_types.go Signed-off-by: Spolti <fspolti@redhat.com>

coderabbitai

Actionable comments posted: 1

🔭 Outside diff range comments (1)

Makefile (1)
142-150: Consider -prune for speed & clarity + cover plain venv dirs
You are already guarding against several virtual-env folders, but:

A plain venv/ (without the dot) is still common and will be scanned.

Using a chain of -not -path … keeps traversing the unwanted trees; -prune prevents the descent and is noticeably faster on large repos.

Example refactor (keeps the existing semantics):
-for file in $$(find . -type f -name "pyproject.toml" \
-  -not -path "./pyproject.toml" \
-  -not -path "*.venv/*" \
-  -not -path "*/venv/*" \
-  -not -path "*/prod_venv/*" \
-  -not -path "*/site-packages/*" \
-  -not -path "*/__pypackages__/*" \
-  -not -path "*/build/*"); do \
+for file in $$(find . $ -path "*/.venv" -o -path "*/venv" -o -path "*/prod_venv" \
+                     -o -path "*/site-packages" -o -path "*/__pypackages__" -o -path "*/build" $ -prune \
+                   -o -name "pyproject.toml" -not -path "./pyproject.toml" -print); do \
This keeps the intent while improving performance and coverage.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd65275 and 767a0f9.

⛔ Files ignored due to path filters (1)

go.sum is excluded by !**/*.sum

📒 Files selected for processing (27)

.flake8 (1 hunks)
Makefile (1 hunks)
go.mod (8 hunks)
pkg/apis/serving/v1alpha1/inference_graph_validation_test.go (3 hunks)
pkg/apis/serving/v1alpha1/trainedmodel_webhook_test.go (3 hunks)
pkg/apis/serving/v1beta1/configmap_test.go (12 hunks)
pkg/apis/serving/v1beta1/inference_service_validation_test.go (22 hunks)
pkg/apis/serving/v1beta1/predictor_model_test.go (1 hunks)
pkg/controller/v1alpha1/trainedmodel/reconcilers/modelconfig/modelconfig_reconciler_test.go (6 hunks)
pkg/controller/v1beta1/inferenceservice/components/explainer.go (1 hunks)
pkg/controller/v1beta1/inferenceservice/components/transformer.go (2 hunks)
pkg/controller/v1beta1/inferenceservice/reconcilers/autoscaler/autoscaler_reconciler_test.go (1 hunks)
pkg/controller/v1beta1/inferenceservice/reconcilers/cabundleconfigmap/cabundle_configmap_reconciler_test.go (10 hunks)
pkg/controller/v1beta1/inferenceservice/reconcilers/deployment/deployment_reconciler_test.go (2 hunks)
pkg/controller/v1beta1/inferenceservice/reconcilers/hpa/hpa_reconciler_test.go (4 hunks)
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler_test.go (23 hunks)
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ingress_reconciler_test.go (1 hunks)
pkg/controller/v1beta1/inferenceservice/reconcilers/keda/keda_reconciler_test.go (4 hunks)
pkg/controller/v1beta1/inferenceservice/reconcilers/knative/ksvc_reconciler_test.go (3 hunks)
pkg/controller/v1beta1/inferenceservice/reconcilers/otel/otel_reconciler_test.go (2 hunks)
pkg/controller/v1beta1/inferenceservice/utils/utils_test.go (6 hunks)
pkg/credentials/service_account_credentials_test.go (18 hunks)
pkg/openapi/openapi_generated.go (24 hunks)
pkg/webhook/admission/localmodelcache/local_model_cache_validation_test.go (5 hunks)
pkg/webhook/admission/pod/mutator_test.go (2 hunks)
pkg/webhook/admission/pod/storage_initializer_injector_test.go (7 hunks)
pkg/webhook/admission/servingruntime/servingruntime_webhook_test.go (1 hunks)

✅ Files skipped from review due to trivial changes (11)

pkg/controller/v1beta1/inferenceservice/reconcilers/autoscaler/autoscaler_reconciler_test.go
.flake8
pkg/controller/v1alpha1/trainedmodel/reconcilers/modelconfig/modelconfig_reconciler_test.go
pkg/controller/v1beta1/inferenceservice/reconcilers/hpa/hpa_reconciler_test.go
pkg/controller/v1beta1/inferenceservice/reconcilers/deployment/deployment_reconciler_test.go
pkg/controller/v1beta1/inferenceservice/reconcilers/cabundleconfigmap/cabundle_configmap_reconciler_test.go
pkg/apis/serving/v1beta1/configmap_test.go
pkg/controller/v1beta1/inferenceservice/reconcilers/keda/keda_reconciler_test.go
pkg/apis/serving/v1beta1/inference_service_validation_test.go
pkg/webhook/admission/pod/storage_initializer_injector_test.go
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler_test.go

🚧 Files skipped from review as they are similar to previous changes (5)

pkg/controller/v1beta1/inferenceservice/components/explainer.go
pkg/controller/v1beta1/inferenceservice/components/transformer.go
pkg/controller/v1beta1/inferenceservice/utils/utils_test.go
pkg/credentials/service_account_credentials_test.go
go.mod

🧰 Additional context used

🧠 Learnings (6)

pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ingress_reconciler_test.go (1)

Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.

pkg/apis/serving/v1beta1/predictor_model_test.go (1)

Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.

pkg/webhook/admission/pod/mutator_test.go (1)

Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.

pkg/apis/serving/v1alpha1/inference_graph_validation_test.go (1)

Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.

pkg/apis/serving/v1alpha1/trainedmodel_webhook_test.go (1)

Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.

pkg/openapi/openapi_generated.go (2)

undefined

<retrieved_learning>
Learnt from: israel-hdez
PR: #738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.
</retrieved_learning>

<retrieved_learning>
Learnt from: israel-hdez
PR: #738
File: pkg/types/config.go:30-30
Timestamp: 2025-07-17T16:20:08.917Z
Learning: In pkg/types/config.go, the StorageInitializerConfig struct field EnableOciImageSource uses the JSON tag "enableModelcar" (instead of matching the field name) to maintain backwards compatibility after the struct was moved from another package. The JSON tag cannot be changed to avoid breaking existing configurations.
</retrieved_learning>

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: precommit-check
GitHub Check: test
GitHub Check: test
GitHub Check: build (3.11)
GitHub Check: test

🔇 Additional comments (17)

pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ingress_reconciler_test.go (1)

2145-2145: LGTM: Good practice using test context.

The change from context.TODO() to t.Context() is a good improvement as it provides proper context propagation tied to the test's lifecycle, enabling better timeout handling and cancellation.

pkg/webhook/admission/servingruntime/servingruntime_webhook_test.go (1)

1885-1885: LGTM! Improved context usage in test.

This change properly uses the test's context instead of a generic TODO context, which improves context propagation and aligns with testing best practices.

pkg/apis/serving/v1beta1/predictor_model_test.go (1)

493-493: LGTM! Consistent context usage improvement.

This change properly uses the test's context instead of a background context, which is consistent with the broader pattern across the codebase and improves context management in tests.

pkg/webhook/admission/pod/mutator_test.go (1)

53-53: LGTM! Comprehensive context usage improvements.

These changes consistently replace context.Background() with t.Context() across all Kubernetes client operations and the mutator's Handle method, ensuring proper context propagation throughout the test lifecycle.

Also applies to: 298-298, 306-306, 309-309

pkg/webhook/admission/localmodelcache/local_model_cache_validation_test.go (1)

108-108: LGTM! Consistent context improvements across validation tests.

These changes properly replace context.Background() and context.TODO() with t.Context() across all validator method calls (ValidateDelete, ValidateCreate, ValidateUpdate), ensuring consistent context propagation throughout the validation test suite.

Also applies to: 124-124, 156-156, 174-174, 186-186

pkg/controller/v1beta1/inferenceservice/reconcilers/otel/otel_reconciler_test.go (1)

174-174: LGTM! Improved context usage in reconciler tests.

These changes consistently replace context.TODO() with t.Context() across both reconciler operations and client calls in the test functions, ensuring proper context propagation and alignment with testing best practices.

Also applies to: 179-179, 256-256, 261-261

pkg/apis/serving/v1alpha1/inference_graph_validation_test.go (1)

273-273: LGTM! Consistent context usage improvement.

The changes properly replace context.Background() with t.Context() in all validation method calls, which ties the validation context to the test lifecycle. This enables proper timeout handling and cancellation, following Go testing best practices.

Also applies to: 313-313, 351-351

pkg/controller/v1beta1/inferenceservice/reconcilers/knative/ksvc_reconciler_test.go (1)

351-351: LGTM! Improved context handling in controller tests.

The changes consistently replace context.TODO() with t.Context() in client operations and reconciler calls. This improvement ensures proper context lifecycle management and enables timeout handling in controller tests.

Also applies to: 367-367, 378-378

pkg/apis/serving/v1alpha1/trainedmodel_webhook_test.go (1)

138-138: LGTM! Consistent context usage improvement in webhook tests.

The changes properly replace context.Background() with t.Context() in all validation method calls, ensuring that webhook validation operations are tied to the test lifecycle. This enables proper timeout handling and follows Go testing best practices.

Also applies to: 263-263, 292-292

pkg/openapi/openapi_generated.go (8)

31-47: LGTM! New LLM type registrations follow established patterns.

The additions of LLM-related OpenAPI definitions are consistent with the existing pattern and align with the introduction of new CRDs for LLM inference services.

860-1044: Schema definitions are well-structured and complete.

The new LLM-related schemas follow Kubernetes API conventions correctly:

Primary resource (LLMInferenceService) includes both spec and status

Config resource (LLMInferenceServiceConfig) appropriately omits status

List types are properly defined with required items field

All dependencies are correctly specified

2912-2912: Documentation improvements enhance clarity.

The init container description updates fix grammatical issues and improve readability.

Also applies to: 3402-3402, 3892-3892

2566-2566: Consistent refactoring of storage specifications across all model types.

The systematic replacement of StorageSpec with ModelStorageSpec across all model-serving components is well-executed. This change provides better type specificity and aligns with the introduction of separate storage types for different purposes (models vs. logs).

Also applies to: 4709-4709, 5788-5788, 6548-6548, 7224-7224, 7667-7667, 8071-8071, 8396-8396, 9263-9263, 10294-10294, 10702-10702, 11028-11028, 11921-11921, 12715-12715

6038-6051: Runtime tracking fields enhance service observability.

The addition of servingRuntimeName and clusterServingRuntimeName fields to the InferenceServiceStatus provides valuable information for tracking which runtime an InferenceService is using.

6668-6726: Logger storage configuration properly implemented.

The addition of LoggerStorageSpec and its integration into LoggerSpec enables proper configuration of storage for inference logger cloud events. The schema structure is consistent with other storage specifications while appropriately omitting model-specific fields like schemaPath.

7277-7324: ModelStorageSpec provides appropriate model-specific storage configuration.

The new ModelStorageSpec type extends the generic storage configuration with a schemaPath field, which is valuable for model schema validation scenarios. This specialization justifies the refactoring from the generic StorageSpec.

10349-10349: Documentation updates improve accuracy.

The description updates for StorageSpec make the documentation more precise by clarifying that paths are relative to the storage URI and using more generic terminology ("object" instead of specific types).

Also applies to: 10354-10354, 10377-10377

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (2)

pkg/controller/v1beta1/inferenceservice/controller.go (2)

446-476: Issues remain in commented code.

The commented clusterServingRuntimeFunc method still contains the same issues identified in previous reviews:

Incorrect namespace filtering for cluster-scoped resources

Missing filtering to only enqueue InferenceServices that reference the specific ClusterServingRuntime

While this code is currently commented out, these issues should be addressed before uncommenting.

502-516: Field path mismatch persists.

The index field path "spec.predictor.model.runtime" still doesn't match the returned values from Status.ServingRuntimeName. This is the same issue identified in previous reviews.

🧹 Nitpick comments (1)

pkg/controller/v1beta1/inferenceservice/controller.go (1)
446-446: Fix comment formatting.

Add spaces after // in comments for consistency with Go formatting standards.
-//func (r *InferenceServiceReconciler) clusterServingRuntimeFunc(ctx context.Context, obj client.Object) []reconcile.Request {
+// func (r *InferenceServiceReconciler) clusterServingRuntimeFunc(ctx context.Context, obj client.Object) []reconcile.Request {
Apply similar formatting to other comment lines (510, 530, 598).

Also applies to: 510-510, 530-530, 598-598

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between afb2f11 and 4dff74d.

📒 Files selected for processing (1)

pkg/controller/v1beta1/inferenceservice/controller.go (7 hunks)

🧰 Additional context used

🧠 Learnings (1)

pkg/controller/v1beta1/inferenceservice/controller.go (3)

Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.

Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/types/config.go:30-30
Timestamp: 2025-07-17T16:20:08.917Z
Learning: In pkg/types/config.go, the StorageInitializerConfig struct field EnableOciImageSource uses the JSON tag "enableModelcar" (instead of matching the field name) to maintain backwards compatibility after the struct was moved from another package. The JSON tag cannot be changed to avoid breaking existing configurations.

Learnt from: brettmthompson
PR: opendatahub-io/kserve#709
File: pkg/controller/v1alpha1/utils/utils.go:118-128
Timestamp: 2025-06-30T22:30:38.045Z
Learning: In pkg/controller/v1alpha1/utils/utils.go, the condition `minReplicas == nil && constants.DefaultMinReplicas == 0` in ValidateInitialScaleAnnotation function is intentionally kept even though currently unreachable (DefaultMinReplicas = 1) to handle potential future changes to DefaultMinReplicas value, implementing defensive programming to limit future code changes.

🪛 golangci-lint (1.64.8)

pkg/controller/v1beta1/inferenceservice/controller.go

446-446: commentFormatting: put a space between // and comment text

(gocritic)

510-510: commentFormatting: put a space between // and comment text

(gocritic)

530-530: commentFormatting: put a space between // and comment text

(gocritic)

598-598: commentFormatting: put a space between // and comment text

(gocritic)

🔇 Additional comments (5)

pkg/controller/v1beta1/inferenceservice/controller.go (5)

48-48: LGTM: Required import for new handler functionality.

The handler import is correctly added to support the new EnqueueRequestsFromMapFunc usage.

205-212: LGTM: Proper auto-update check implementation.

The early return logic correctly prevents unnecessary reconciliations when auto-update is disabled and the service is ready. This optimization aligns with the similar check in the servingRuntimeFunc method.

411-444: LGTM: Correct ServingRuntime change handling.

The method correctly handles namespaced ServingRuntime resources by:

Filtering InferenceServices by namespace

Respecting the auto-update annotation

Only enqueueing InferenceServices that reference the specific ServingRuntime

This implementation follows proper controller patterns.

518-527: LGTM: Appropriate predicate for spec changes.

The predicate correctly filters to only trigger reconciliation on ServingRuntime spec changes, ignoring create/delete/generic events. This is an appropriate optimization.

597-599: LGTM: Correct watch setup for ServingRuntime.

The watch configuration correctly connects the ServingRuntime resources with the mapping function and predicate. The commented ClusterServingRuntime watch is consistent with the commented implementation.

chore: added as part of kserve@a070e2c Signed-off-by: Spolti <fspolti@redhat.com>

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go (2)
4022-4022: Fix incorrect comment.

The comment incorrectly states auto-update is disabled when it's actually enabled (annotation value is "false" which means disable auto-update is false, i.e., auto-update is enabled).
-			// Define InferenceService with auto-update disabled.
+			// Define InferenceService with auto-update enabled.
4379-4379: Fix incorrect comment (duplicate issue).

Same issue as above - the comment incorrectly states auto-update is disabled when it's actually enabled.
-			// Define InferenceService with auto-update disabled.
+			// Define InferenceService with auto-update enabled.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b60968a and efecad2.

📒 Files selected for processing (3)

pkg/apis/serving/v1alpha1/servingruntime_types.go (1 hunks)
pkg/controller/v1beta1/inferenceservice/controller.go (7 hunks)
pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go (17 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/controller/v1beta1/inferenceservice/controller.go

🧰 Additional context used

🧠 Learnings (2)

pkg/apis/serving/v1alpha1/servingruntime_types.go (2)

Learnt from: israel-hdez
PR: #738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.

Learnt from: brettmthompson
PR: #709
File: pkg/controller/v1alpha1/utils/utils.go:118-128
Timestamp: 2025-06-30T22:30:38.045Z
Learning: In pkg/controller/v1alpha1/utils/utils.go, the condition minReplicas == nil && constants.DefaultMinReplicas == 0 in ValidateInitialScaleAnnotation function is intentionally kept even though currently unreachable (DefaultMinReplicas = 1) to handle potential future changes to DefaultMinReplicas value, implementing defensive programming to limit future code changes.

pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go (3)

Learnt from: israel-hdez
PR: #738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.

Learnt from: israel-hdez
PR: #738
File: pkg/types/config.go:30-30
Timestamp: 2025-07-17T16:20:08.917Z
Learning: In pkg/types/config.go, the StorageInitializerConfig struct field EnableOciImageSource uses the JSON tag "enableModelcar" (instead of matching the field name) to maintain backwards compatibility after the struct was moved from another package. The JSON tag cannot be changed to avoid breaking existing configurations.

Learnt from: brettmthompson
PR: #709
File: pkg/controller/v1alpha1/utils/utils.go:118-128
Timestamp: 2025-06-30T22:30:38.045Z
Learning: In pkg/controller/v1alpha1/utils/utils.go, the condition minReplicas == nil && constants.DefaultMinReplicas == 0 in ValidateInitialScaleAnnotation function is intentionally kept even though currently unreachable (DefaultMinReplicas = 1) to handle potential future changes to DefaultMinReplicas value, implementing defensive programming to limit future code changes.

🔇 Additional comments (9)

pkg/apis/serving/v1alpha1/servingruntime_types.go (1)

284-285: Dependent Code Verified – No Active ClusterServingRuntime Registrations

A project-wide search confirms that all controllers, webhooks, clients, informers, and tests referencing ClusterServingRuntime are commented out or removed from the active code paths. No unregistered types are used at runtime.

• SchemeBuilder only registers ServingRuntime types.
• Controllers/watchers for ClusterServingRuntime are entirely commented out.
• GetServingRuntime falls back exclusively to namespace-scoped ServingRuntime.
• Webhook handlers and informer/client code for ClusterServingRuntime remain disabled.
• Tests for ClusterServingRuntime are likewise commented out.

No further action is needed here.

pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go (8)

63-63: Good addition of consistency timeout constant.

The new consistentlyTimeout constant is appropriately set to 5 seconds for consistency checks in the test suite.

587-587: Correctly updated status expectation.

The addition of ServingRuntimeName field to the expected status is consistent with the new functionality for tracking which serving runtime is being used.

1147-1147: Consistent status field updates.

The addition of ServingRuntimeName fields across multiple test expectations correctly reflects the new status tracking functionality.

Also applies to: 1703-1703

2211-2211: Good typo correction.

Fixed the typo from "Kubernestes" to "Kubernetes" in the comment.

3805-3949: Well-structured test for auto-update behavior when annotation is missing.

The test correctly verifies that when the auto-update annotation is not present, the deployment gets updated when the serving runtime changes. The test flow is logical and uses appropriate assertions.

4094-4253: Proper negative test case with consistency check.

This test correctly verifies that when auto-update is disabled ("true" value), the deployment labels remain unchanged even when the serving runtime is updated. The use of Consistently with consistentlyTimeout is appropriate for this negative assertion.

4254-4484: Good test for selective reconciliation with multiple serving runtimes.

The test properly validates that only the deployment associated with the updated serving runtime gets reconciled, while other deployments remain unchanged. This ensures the controller correctly maps serving runtime changes to the appropriate inference services.

4830-4830: Consistent ServingRuntimeName additions across all test expectations.

All test cases have been properly updated to include the ServingRuntimeName field in their expected status, maintaining consistency across the test suite.

Also applies to: 5407-5407, 6251-6251, 7204-7204, 7896-7896, 8790-8790, 9835-9835, 10727-10727, 11249-11249

spolti · 2025-07-21T17:26:01Z

/test e2e-predictor

VedantMahabaleshwarkar · 2025-07-22T15:58:55Z

/approve

mholder6 · 2025-07-22T16:47:37Z

/lgtm /approve

mholder6 · 2025-07-23T15:21:15Z

/lgtm
/approve

openshift-ci · 2025-07-23T15:21:25Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mholder6, spolti, VedantMahabaleshwarkar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [VedantMahabaleshwarkar,mholder6,spolti]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

pierDipi and others added 9 commits June 25, 2025 21:02

[API] Define LLMInferenceService and LLMInferenceServiceConfig types …

40806a7

…and CRDs (kserve#4522) Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com> Co-authored-by: Jooho Lee <jlee@redhat.com>

Stop and Resume a transformer (kserve#4534)

ce43040

Signed-off-by: Hannah DeFazio <h2defazio@gmail.com> Co-authored-by: Jooho Lee <jlee@redhat.com>

feat: Store Inference logging to blob storage (kserve#4473)

9e55d59

Signed-off-by: cory-johannsen <cjohannsen@cloudera.com> Signed-off-by: cjohannsen-cloudera <cjohannsen@cloudera.com> Co-authored-by: Sivanantham <90966311+sivanantha321@users.noreply.github.com>

Fix outdated BentoML import in sample code (BentoService no longer av…

2f338a6

…ailable in v1.x) (kserve#4540) Signed-off-by: YehCC52 <hometown456@gmail.com>

Auto-update annotation for isvc. (kserve#4342)

a070e2c

Signed-off-by: Andres Llausas <allausas@redhat.com> Co-authored-by: Dan Sun <dsun20@bloomberg.net>

Stop and resume an explainer (kserve#4546)

fba4113

Signed-off-by: Hannah DeFazio <h2defazio@gmail.com>

fix: unset clampMax and clampMin, since they are not for replicas (ks…

178235a

…erve#4556) Signed-off-by: Vincent Hou <shou73@bloomberg.net>

github-project-automation Bot added this to ODH Model Serving Planning Jul 17, 2025

github-project-automation Bot moved this to New/Backlog in ODH Model Serving Planning Jul 17, 2025

openshift-ci Bot added the approved label Jul 17, 2025

coderabbitai Bot reviewed Jul 17, 2025

View reviewed changes

spolti force-pushed the 250717_sync_master branch from 9f4edf6 to 55359b2 Compare July 17, 2025 20:27

Update golang to 1.24

cd65275

chore: required by "sigs.k8s.io/gateway-api-inference-extension/api/v1alpha2" -> llm_inference_service_types.go Signed-off-by: Spolti <fspolti@redhat.com>

spolti force-pushed the 250717_sync_master branch from 55359b2 to cd65275 Compare July 17, 2025 20:32

updated Makefile and .flake8 to ignore dirs/make precommit/generate

767a0f9

coderabbitai Bot reviewed Jul 18, 2025

View reviewed changes

Comment thread Makefile Outdated

updated Makefile to ignore dirs/make precommit/generate

afb2f11

coderabbitai Bot reviewed Jul 18, 2025

View reviewed changes

spolti force-pushed the 250717_sync_master branch from 4dff74d to b60968a Compare July 18, 2025 18:44

Do not register ClusterServingRuntime

efecad2

chore: added as part of kserve@a070e2c Signed-off-by: Spolti <fspolti@redhat.com>

spolti force-pushed the 250717_sync_master branch from b60968a to efecad2 Compare July 21, 2025 14:53

coderabbitai Bot reviewed Jul 21, 2025

View reviewed changes

Comment thread pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go

Merge branch 'master' into 250717_sync_master

a843f5f

spolti added the tide/merge-method-merge Change lgtm to use a merge commit label Jul 22, 2025

openshift-ci Bot assigned mholder6 Jul 23, 2025

openshift-ci Bot added the lgtm label Jul 23, 2025

openshift-merge-bot Bot merged commit 36f1080 into opendatahub-io:master Jul 23, 2025
30 of 33 checks passed

github-project-automation Bot moved this from New/Backlog to Done in ODH Model Serving Planning Jul 23, 2025

spolti deleted the 250717_sync_master branch July 23, 2025 15:42

This was referenced Jul 31, 2025

250731 sync kserve/master into odh/master batch1 #805

Merged

250731 sync kserve/master into odh/master batch3 #820

Merged

Conversation

spolti commented Jul 17, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jul 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

spolti commented Jul 21, 2025

Uh oh!

VedantMahabaleshwarkar commented Jul 22, 2025

Uh oh!

mholder6 commented Jul 22, 2025

Uh oh!

mholder6 commented Jul 23, 2025

Uh oh!

openshift-ci Bot commented Jul 23, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

spolti commented Jul 17, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jul 17, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)