250717 sync master#752
Conversation
…and CRDs (kserve#4522) Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com> Co-authored-by: Jooho Lee <jlee@redhat.com>
Signed-off-by: Hannah DeFazio <h2defazio@gmail.com> Co-authored-by: Jooho Lee <jlee@redhat.com>
Signed-off-by: Spolti <fspolti@redhat.com> Signed-off-by: Andres Llausas <allausas@redhat.com> Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com> Signed-off-by: tarilabs <matteo.mortari@gmail.com> Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com> Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com> Signed-off-by: Jin Dong <jdong183@bloomberg.net> Signed-off-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com> Co-authored-by: Filippe Spolti <fspolti@redhat.com> Co-authored-by: Andres Llausas <allausas@redhat.com> Co-authored-by: Matteo Mortari <matteo.mortari@gmail.com> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Co-authored-by: Sivanantham <90966311+sivanantha321@users.noreply.github.com> Co-authored-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com> Co-authored-by: Jin Dong <greenmoon55@users.noreply.github.com> Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com> Co-authored-by: Jooho Lee <jlee@redhat.com>
Signed-off-by: cory-johannsen <cjohannsen@cloudera.com> Signed-off-by: cjohannsen-cloudera <cjohannsen@cloudera.com> Co-authored-by: Sivanantham <90966311+sivanantha321@users.noreply.github.com>
…ailable in v1.x) (kserve#4540) Signed-off-by: YehCC52 <hometown456@gmail.com>
Signed-off-by: Andres Llausas <allausas@redhat.com> Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Hannah DeFazio <h2defazio@gmail.com>
…erve#4556) Signed-off-by: Vincent Hou <shou73@bloomberg.net>
This batch includes the following commits: 178235a fix: unset clampMax and clampMin, since they are not for replicas (kserve#4556) fba4113 Stop and resume an explainer (kserve#4546) a070e2c Auto-update annotation for isvc. (kserve#4342) 2f338a6 Fix outdated BentoML import in sample code (BentoService no longer available in v1.x) (kserve#4540) 9e55d59 feat: Store Inference logging to blob storage (kserve#4473) b1bae1b Allow OCI for multi-node/multi-gpu (kserve#4441) ce43040 Stop and Resume a transformer (kserve#4534) 40806a7 [API] Define LLMInferenceService and LLMInferenceServiceConfig types and CRDs (kserve#4522)
WalkthroughThis update introduces new Kubernetes CustomResourceDefinitions (CRDs) and API types for managing LLMInferenceService and LLMInferenceServiceConfig resources, including their specifications, status, and list types. It also adds Python SDK models and documentation for these resources, and updates OpenAPI schemas to support the new LLM inference service functionality. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Kubernetes API Server
participant Controller
participant LLMInferenceService CRD
User->>Kubernetes API Server: Create LLMInferenceService resource
Kubernetes API Server->>LLMInferenceService CRD: Store resource
Kubernetes API Server->>Controller: Notify of new LLMInferenceService
Controller->>Kubernetes API Server: Read LLMInferenceService spec
Controller->>Kubernetes API Server: Create/manage Deployments, Services, etc.
Controller-->>Kubernetes API Server: Update LLMInferenceService status
Estimated code review effort3 (~45 minutes) — The changes span multiple core API definitions, CRDs, deep copy methods, Python SDK models, and documentation, requiring careful review of schema correctness and consistency. Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ast-grep (0.38.6)pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 10
🔭 Outside diff range comments (3)
pkg/client/clientset/versioned/fake/clientset_generated.go (1)
54-66: Fix the typo in variable name.The enhancement to extract and forward ListOptions from watch actions is a good improvement for more precise watch behavior in tests. However, there's a typo in the variable name.
- if watchActcion, ok := action.(testing.WatchActionImpl); ok { + if watchAction, ok := action.(testing.WatchActionImpl); ok { - opts = watchActcion.ListOptions + opts = watchAction.ListOptionsconfig/crd/full/serving.kserve.io_clusterservingruntimes.yaml (2)
2717-2779: Duplicate schema error – same misplacement inworkerSpec.containersThe identical
stopSignalinsertion underlifecyclerecurs here. After updating the maincontainersdefinition, replicate the correction forworkerSpec.containersto avoid schema drift.- preStop: … - stopSignal: - type: string - type: object + preStop: … + type: object + stopSignal: + type: stringKeep the two container schemas consistent.
736-798: RealignstopSignaloutside oflifecyclein the CRDThe
stopSignalproperty is currently nested underlifecycle.properties(lines ~796–798), but according to upstream Kubernetes (v1.29+), it must be a top-level field of theContainerobject, alongsidelifecycle, not inside it.Please update
config/crd/full/serving.kserve.io_clusterservingruntimes.yamlas follows:• File: config/crd/full/serving.kserve.io_clusterservingruntimes.yaml
Affected block around lines 674–798.Proposed diff:
… properties: imagePullPolicy: type: string - lifecycle: - properties: - postStart: … - preStop: … - stopSignal: - type: string - type: object + lifecycle: + properties: + postStart: … + preStop: … + type: object + stopSignal: + type: string …• Afterwards, verify the controller’s pod-spec construction to ensure it copies
stopSignalfrom the CRD into the generatedcorev1.Container.
Failing to move this now will cause valid Kubernetes manifests to be rejected and users’stopSignalsettings to be dropped at runtime.
♻️ Duplicate comments (40)
pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go (2)
1146-1148: Consistent status field additions.Same pattern as previously reviewed - correctly adds the new status fields.
4828-4830: Consistent status field additions.Same pattern as previously reviewed.
config/crd/full/serving.kserve.io_inferenceservices.yaml (23)
1601-1603: SamestopSignalimprovement applies herePlease replicate the description/validation addition suggested in the previous comment.
2357-2359: SamestopSignalimprovement applies here
4772-4774: SamestopSignalimprovement applies here
5518-5520: SamestopSignalimprovement applies here
6217-6219: SamestopSignalimprovement applies here
6902-6904: SamestopSignalimprovement applies here
7374-7386: Replicatelogger.storagevalidationSee first
logger.storagecomment — the same required-field/description feedback applies here.
7629-7631: SamestopSignalimprovement applies here
8336-8338: SamestopSignalimprovement applies here
9038-9040: SamestopSignalimprovement applies here
9727-9729: SamestopSignalimprovement applies here
10423-10425: SamestopSignalimprovement applies here
11291-11293: SamestopSignalimprovement applies here
11982-11984: SamestopSignalimprovement applies here
12751-12753: SamestopSignalimprovement applies here
14668-14670: SamestopSignalimprovement applies here
15377-15379: SamestopSignalimprovement applies here
16094-16096: SamestopSignalimprovement applies here
17832-17834: SamestopSignalimprovement applies here
19144-19146: SamestopSignalimprovement applies here
19900-19902: SamestopSignalimprovement applies here
20368-20380: Replicatelogger.storagevalidationSee earlier
logger.storagecomment — same feedback applies.
21628-21630: Do the same forservingRuntimeNameAdd a
printColumnanddescription, mirroring the recommendation forclusterServingRuntimeName.pkg/openapi/openapi_generated.go (14)
3397-3397: Consistent init containers documentation across schemas.
3887-3887: Consistent init containers documentation maintained.
5782-5782: ModelStorageSpec migration continues consistently.Also applies to: 5789-5789
6541-6541: Storage spec migration pattern maintained.Also applies to: 6548-6548
7215-7215: Storage spec migration continues.Also applies to: 7223-7223
7658-7658: Consistent storage spec update.Also applies to: 7665-7665
8061-8061: Storage spec migration maintained.Also applies to: 8068-8068
8385-8385: Consistent ModelStorageSpec reference.Also applies to: 8392-8392
9251-9251: Storage spec update applied.Also applies to: 9258-9258
10281-10281: ModelStorageSpec migration continues.Also applies to: 10288-10288
10363-10363: Key field documentation clarified.
10688-10688: Storage spec migration applied.Also applies to: 10695-10695
11013-11013: Consistent storage spec update.Also applies to: 11020-11020
11905-11905: ModelStorageSpec reference updated.Also applies to: 11912-11912
pkg/apis/serving/v1alpha1/zz_generated.deepcopy.go (1)
742-751: Deepcopy complexity due to pointer to map.The deepcopy implementation for
Parametersfield requires nested nil checks due to the pointer to map pattern. This reinforces the suggestion to simplify the field to justmap[string]string.
🧹 Nitpick comments (29)
python/kserve/docs/V1beta1CustomTransformer.md (1)
22-22: Docs wording still quite verbose
LanguageTool flags “prior to”, “taken into account”, and “in a similar fashion” as wordy. If you ever touch the generator, consider trimming those to “before”, “considered”, and “similarly” for brevity.python/kserve/kserve/models/v1beta1_custom_explainer.py (2)
610-611: Same verbose phrasing carried over from generator
The docstring still uses the wordy trio (“prior to”, “taken into account”, “in a similar fashion”). Optional: simplify to “before”, “considered”, “similarly” to align with concise style elsewhere.
621-622: Duplicate comment – applies to setter docstring as well
Setter docstring repeats the exact sentence; same wording suggestion as above. If this file is regenerated, fix in the template to avoid repeating manual edits.python/kserve/docs/V1beta1ModelStorageSpec.md (1)
4-4: Add blank lines around table for better formattingThe table should be surrounded by blank lines for better readability and to follow markdown best practices.
## Properties + Name | Type | Description | Notes ------------ | ------------- | ------------- | ------------- **key** | **str** | The Storage Key in the secret for this object. | [optional] **parameters** | **dict(str, str)** | Parameters to override the default storage credentials and config. | [optional] **path** | **str** | The path to the object in the storage. Note that this path is relative to the storage URI. | [optional] **schema_path** | **str** | The path to the model schema file in the storage. | [optional] +python/kserve/docs/V1alpha1LLMInferenceServiceList.md (1)
1-15: Fix markdown formatting issues.The documentation content is correct, but there are several formatting issues that should be addressed:
Apply this diff to fix the markdown formatting:
# V1alpha1LLMInferenceServiceList LLMInferenceServiceList is the list type for LLMInferenceService. + ## Properties Name | Type | Description | Notes ------------ | ------------- | ------------- | ------------- -**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional] -**items** | [**list[V1alpha1LLMInferenceService]**](V1alpha1LLMInferenceService.md) | | -**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | [optional] -**metadata** | [**V1ListMeta**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ListMeta.md) | | [optional] +**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: <https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources> | [optional] +**items** | [**list[V1alpha1LLMInferenceService]**](V1alpha1LLMInferenceService.md) | | +**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: <https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds> | [optional] +**metadata** | [**V1ListMeta**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ListMeta.md) | | [optional] + [[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md) -python/kserve/docs/V1beta1LoggerStorageSpec.md (1)
3-10: Fix table formatting and add missing description.The documentation has a formatting issue and the
service_account_namefield lacks a description.Apply this diff to fix the formatting:
# V1beta1LoggerStorageSpec + ## Properties Name | Type | Description | Notes ------------ | ------------- | ------------- | ------------- **key** | **str** | The Storage Key in the secret for this object. | [optional] **parameters** | **dict(str, str)** | Parameters to override the default storage credentials and config. | [optional] **path** | **str** | The path to the object in the storage. Note that this path is relative to the storage URI. | [optional] -**service_account_name** | **str** | | [optional] +**service_account_name** | **str** | The name of the service account to use for storage authentication. | [optional] +python/kserve/docs/V1beta1StorageSpec.md (1)
5-5: Fix markdown table formatting.The markdown table should be surrounded by blank lines for proper rendering.
**path** | **str** | The path to the object in the storage. Note that this path is relative to the storage URI. | [optional] + Name | Type | Description | Notes + [[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md)python/kserve/kserve/models/v1beta1_logger_spec.py (1)
158-177: Storage property implementation follows established patterns.The getter and setter methods for the
storageproperty are implemented consistently with other properties in this class. However, the docstring is missing a description of what the storage property represents.Consider adding a description to the docstring similar to other properties:
- """Gets the storage of this V1beta1LoggerSpec. # noqa: E501 - - - :return: The storage of this V1beta1LoggerSpec. # noqa: E501 + """Gets the storage of this V1beta1LoggerSpec. # noqa: E501 + + Storage configuration for inference logger cloud events. # noqa: E501 + + :return: The storage of this V1beta1LoggerSpec. # noqa: E501And similarly for the setter:
- """Sets the storage of this V1beta1LoggerSpec. - - - :param storage: The storage of this V1beta1LoggerSpec. # noqa: E501 + """Sets the storage of this V1beta1LoggerSpec. + + Storage configuration for inference logger cloud events. # noqa: E501 + + :param storage: The storage of this V1beta1LoggerSpec. # noqa: E501python/kserve/docs/V1alpha1LLMInferenceServiceConfigList.md (1)
1-15: Documentation content is accurate and follows standard patterns.The documentation properly describes the
LLMInferenceServiceConfigListtype with standard Kubernetes list properties and appropriate links to related documentation.Address the markdown formatting issues identified by static analysis:
+ ## Properties + Name | Type | Description | Notes ------------ | ------------- | ------------- | ------------- -**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional] -**items** | [**list[V1alpha1LLMInferenceServiceConfig]**](V1alpha1LLMInferenceServiceConfig.md) | | -**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | [optional] -**metadata** | [**V1ListMeta**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ListMeta.md) | | [optional] +**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: <https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources> | [optional] +**items** | [**list[V1alpha1LLMInferenceServiceConfig]**](V1alpha1LLMInferenceServiceConfig.md) | | +**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: <https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds> | [optional] +**metadata** | [**V1ListMeta**](<https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ListMeta.md>) | | [optional] + +[[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md) -[[Back to Model list]](../README.md#documentation-for-models) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to README]](../README.md) -This addresses:
- Missing blank lines around the table
- Bare URLs that should be wrapped in angle brackets
python/kserve/docs/V1alpha1LLMInferenceServiceConfig.md (1)
1-14: LGTM! Clear documentation with minor formatting improvements needed.The documentation provides a clear description of the LLMInferenceServiceConfig API resource with proper property descriptions and navigation links.
Consider addressing the markdown formatting issues flagged by static analysis:
+ ## Properties + Name | Type | Description | NotesAlso consider using reference-style links for the long URLs in the table to improve readability:
-**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional] +**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: [API Conventions][api-conventions] | [optional] +[api-conventions]: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resourcespkg/controller/v1beta1/inferenceservice/components/predictor.go (1)
345-352: LGTM - Consistent runtime type handlingThe code correctly applies the same mutual exclusivity logic for auto-selected runtimes as used for explicitly specified runtimes. The status fields are properly maintained to reflect the runtime type.
Note: There's a potential minor optimization opportunity here as
GetServingRuntimeis called twice for the same runtime (once during selection and once here for the boolean), but this doesn't affect correctness.config/crd/full/serving.kserve.io_servingruntimes.yaml (2)
796-798: Document & validate the newly-addedstopSignalfield
stopSignalis introduced as a plain string, but its accepted values and purpose are not obvious. Tightening the schema will improve UX and prevent typos.stopSignal: + description: "POSIX signal (e.g. SIGTERM) that the runtime controller should send to gracefully terminate the container." + pattern: "^SIG[A-Z0-9]+$" type: stringAdding a description and a simple pattern (or an
enumif only a few signals are allowed) keeps the field optional while making its intent self-evident.
2777-2779: Apply the same improvements to the worker container schemaThe
workerSpec.containersstanza now mirrors the main container spec. Ensure any description / validation you add for the primarystopSignalis duplicated here, and confirm your reconciliation logic actually propagatesworkerSpec.stopSignaldown to the generated Pod template.go.mod (1)
173-182: Document the rationale for dependency pinning.The exclude directive and replace directive are used to resolve version conflicts. While the comments explain the immediate reason, consider adding more context about why KEDA requires 0.19 and GIE uses 0.21 of controller-runtime.
// google.golang.org/grpc/stats/opentelemetry is used by the keda package. // To avoid ambiguous import error due to multiple module versions found for the same import path, // we force the version in google.golang.org/grpc (v1.71.0). // Instead of replacing the path directly, exclude the problematic package // This avoids "module used for two different module paths" error exclude google.golang.org/grpc/stats/opentelemetry v0.0.0-00010101000000-000000000000 -// Currently, KEDA requires 0.19 but GIE uses 0.21 +// Currently, KEDA v2.16.1 requires controller-runtime 0.19.x for compatibility with its internal APIs, +// while gateway-api-inference-extension (GIE) v0.3.0 uses 0.21.x. We pin to 0.19.7 to maintain +// compatibility with KEDA until it supports newer controller-runtime versions. replace sigs.k8s.io/controller-runtime => sigs.k8s.io/controller-runtime v0.19.7pkg/constants/constants.go (1)
125-129: Consider grouping related annotation keys.The new annotation keys are well-named and follow the existing pattern. However, for better organization, consider grouping all logger-related annotation keys together.
PrometheusPathAnnotationKey = "prometheus.io/path" StorageReadonlyAnnotationKey = "storage.kserve.io/readonly" DefaultPrometheusPath = "/metrics" QueueProxyAggregatePrometheusMetricsPort = "9088" DefaultPodPrometheusPort = "9091" NodeGroupAnnotationKey = KServeAPIGroupName + "/nodegroup" + // Logger-related annotations LoggerSecretNameKey = KServeAPIGroupName + "/logger-secret-name" LoggerCredentialPathKey = KServeAPIGroupName + "/logger-secret-path" LoggerCredentialFileKey = KServeAPIGroupName + "/logger-secret-file" + // Auto-update control DisableAutoUpdateAnnotationKey = KServeAPIGroupName + "/disable-auto-update"pkg/logger/mock.go (1)
26-36: Consider adding buffer to the response channel.The unbuffered channel could cause blocking if the test doesn't consume messages immediately.
func NewMockStore(storageSpec *v1beta1.StorageSpec) *MockStore { return &MockStore{ StorageSpec: storageSpec, - ResponseChan: make(chan *LogRequest), + ResponseChan: make(chan *LogRequest, 100), // Buffer to prevent blocking } }python/kserve/kserve/models/v1beta1_logger_storage_spec.py (2)
157-157: Add missing docstring for service_account_name property.The getter method for
service_account_nameis missing its docstring while all other properties have proper documentation.- + The service account name for accessing the storage. # noqa: E501
63-83: Consider adding type hints for better IDE support.While this is auto-generated code, adding type hints would improve developer experience.
- def __init__(self, key=None, parameters=None, path=None, service_account_name=None, local_vars_configuration=None): # noqa: E501 + def __init__(self, key: str = None, parameters: dict = None, path: str = None, service_account_name: str = None, local_vars_configuration=None): # noqa: E501pkg/controller/v1beta1/inferenceservice/controller.go (1)
530-530: Remove or implement the TODO comment.The TODO comment suggests finding a way to distinguish ServingRuntime types, but the current implementation already handles them separately with different handlers.
- // TODO: Find a way to distinguish if the ServingRuntime is a ClusterServingRuntime or notpkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go (1)
3776-4484: Consider refactoring repetitive test setup code.The test cases contain significant duplication in ServingRuntime and InferenceService setup. Consider extracting common setup into helper functions to improve maintainability.
Example helper functions:
func createTestServingRuntime(name, namespace, framework string) *v1alpha1.ServingRuntime { return &v1alpha1.ServingRuntime{ ObjectMeta: metav1.ObjectMeta{ Name: name, Namespace: namespace, }, Spec: v1alpha1.ServingRuntimeSpec{ SupportedModelFormats: []v1alpha1.SupportedModelFormat{ { Name: framework, Version: proto.String("1"), AutoSelect: proto.Bool(true), }, }, ServingRuntimePodSpec: v1alpha1.ServingRuntimePodSpec{ Labels: map[string]string{ "key1": "val1FromSR", "key2": "val2FromSR", "key3": "val3FromSR", }, // ... rest of the spec }, Disabled: proto.Bool(false), }, } } func createTestInferenceService(name, namespace string, autoUpdateValue string) *v1beta1.InferenceService { // Similar helper for InferenceService creation }config/crd/full/serving.kserve.io_inferenceservices.yaml (3)
738-740: Add description & regex validation for newstopSignalfieldOnly the
typeis declared. Adding a shortdescriptionplus a conservative regex such as'^SIG[A-Z]+$'(or anenumwith the signals you support) will self-document intent and prevent clearly invalid input at admission time.stopSignal: + description: Signal (e.g. "SIGTERM") used to gracefully terminate the model server. + pattern: '^SIG[A-Z]+$' type: string
2825-2837:logger.storageschema is underspecified — clarify required fieldsAt least one of
keyorpathis presumably mandatory for a logger target. Consider adding arequired: ["key"](or["path"]) array, plus descriptions for each property to make the CRD self-describing. Without this, invalid empty objects will pass validation and fail later at runtime.
21480-21482: ExposeclusterServingRuntimeNameviaprintColumnand add descriptionAdding
x-kubernetes-print-column: - name: ClusterRuntime type: string jsonPath: .status.clusterServingRuntimeNamemakes this useful in
kubectl getlistings. Also document the field with adescriptionto clarify its intent.pkg/webhook/admission/pod/agent_injector.go (1)
209-223: Consider refactoring for improved readability.The nested nil checks are correct but could be simplified for better readability.
Consider this refactor:
- storagePath := "" - if ag.loggerConfig.Store != nil { - if ag.loggerConfig.Store.Path != nil { - storagePath = *ag.loggerConfig.Store.Path - } - } - storageFormat := "" - if ag.loggerConfig.Store != nil { - if ag.loggerConfig.Store.Parameters != nil { - format, ok := (*ag.loggerConfig.Store.Parameters)[constants.LoggerFormatKey] - if ok { - storageFormat = format - } - } - } + storagePath := "" + storageFormat := "" + if ag.loggerConfig.Store != nil { + if ag.loggerConfig.Store.Path != nil { + storagePath = *ag.loggerConfig.Store.Path + } + if ag.loggerConfig.Store.Parameters != nil { + if format, ok := (*ag.loggerConfig.Store.Parameters)[constants.LoggerFormatKey]; ok { + storageFormat = format + } + } + }pkg/openapi/swagger.json (2)
3306-3310: Potential naming ambiguity betweenclusterServingRuntimeNameandservingRuntimeNameBoth fields are plain strings with no validation hints (enum/regex). Because their semantic difference is subtle (cluster-scoped vs namespaced runtime), it will be easy for users and tooling to mix them up.
Consider:
- adding
x-kubernetes-validations(regex) to enforce qualified names (e.g.<ns>/<name>vs just<name>), or- embedding them in a single object (
runtimeRef { name, namespace?, clusterScoped? }).This will avoid divergent conventions across clients and SDKs.
Also applies to: 3342-3345
3681-3714: Minor:serviceAccountNameinLoggerStorageSpeclacks descriptionEvery other field has a helpful
description. Add one for consistency (e.g. “K8s ServiceAccount to access the backing object store when different from the pod’s default SA”).pkg/logger/store.go (2)
52-69: Consider case-insensitive format comparison.The marshaller implementation is clean. However, consider making the format comparison case-insensitive for better usability.
func getMarshaller(format string) (Marshaller, error) { - switch format { + switch strings.ToLower(format) { case "json": return &JSONMarshaller{}, nil default: return nil, fmt.Errorf("unsupported format %s", format) } }
119-153: Enhance success logging with context.The upload logic is solid, but the success log could be more informative.
- s.log.Info("Successfully uploaded object to S3") + s.log.Infof("Successfully uploaded log to S3: bucket=%s, key=%s", bucket, objectKey)pkg/apis/serving/v1alpha1/llm_inference_service_types.go (1)
149-210: Consider documenting the design asymmetry.The RouterSpec design where HTTPRoute can be either managed (via Spec) or referenced (via Refs), while Gateway and Ingress only support references, is architecturally sound. However, consider adding a comment explaining this design choice for clarity.
Add a comment before GatewaySpec:
// GatewaySpec defines the configuration for a Gateway API Gateway. // Unlike HTTPRouteSpec, Gateway resources are always user-managed (never created by the controller) // to avoid conflicts with cluster-wide Gateway controllers. type GatewaySpec struct {
There was a problem hiding this comment.
Actionable comments posted: 7
♻️ Duplicate comments (18)
config/crd/full/serving.kserve.io_inferenceservices.yaml (18)
1601-1602: Same observation as above – please add adescriptionand (optionally) a simple validation.
2357-2358: Same observation as above – please add adescriptionand (optionally) a simple validation.
7374-7386: Samelogger.storageremarks as above – descriptions &requiredlist are missing.
7629-7630: SamestopSignalremarks as above – please add description & validation.
8336-8337: SamestopSignalremarks as above – please add description & validation.
9038-9039: SamestopSignalremarks as above – please add description & validation.
9727-9728: SamestopSignalremarks as above – please add description & validation.
10423-10424: SamestopSignalremarks as above – please add description & validation.
11291-11292: SamestopSignalremarks as above – please add description & validation.
11982-11983: SamestopSignalremarks as above – please add description & validation.
12751-12752: SamestopSignalremarks as above – please add description & validation.
14668-14669: SamestopSignalremarks as above – please add description & validation.
15377-15378: SamestopSignalremarks as above – please add description & validation.
16094-16095: SamestopSignalremarks as above – please add description & validation.
17832-17833: SamestopSignalremarks as above – please add description & validation.
19144-19145: SamestopSignalremarks as above – please add description & validation.
19900-19901: SamestopSignalremarks as above – please add description & validation.
20368-20380: Samelogger.storageremarks as above – descriptions &requiredlist are missing.
🧹 Nitpick comments (26)
python/kserve/docs/V1beta1PredictorExtensionSpec.md (1)
27-27: Add a short description for the newstoragefieldThe type reference was updated correctly, but the description column was left blank. A one–liner helps users understand why the new
ModelStorageSpecmatters.-**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) | | [optional] +**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) | Configuration for locating the model artifact (URI, access mode, credentials, etc.). | [optional]python/kserve/docs/V1beta1ARTExplainerSpec.md (1)
27-27: Fill in the missing description forstorageReplicating the blank description leaves readers guessing. Consider adding a concise explanation, e.g.:
-**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) | | [optional] +**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) | Where the explainer’s model artifacts are stored and how to retrieve them. | [optional]python/kserve/docs/V1beta1ONNXRuntimeSpec.md (1)
27-27: Clarify the purpose ofstorageSame issue as the other specs: the cell is empty.
-**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) | | [optional] +**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) | Detailed spec for downloading the ONNX model (URI, schema path, auth). | [optional]python/kserve/docs/V1beta1PaddleServerSpec.md (1)
26-26: Populate the empty description columnConsistency with the other spec tables improves usability.
-**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) | | [optional] +**storage** | [**V1beta1ModelStorageSpec**](V1beta1ModelStorageSpec.md) | Location and credentials for Paddle model artifacts. | [optional]pkg/agent/storage/gcs.go (1)
64-66: Placeholder implementation for GCS upload functionality.The method correctly implements the new
UploadObjectinterface requirement with an appropriate error message indicating the functionality is not yet implemented.Consider creating a tracking issue for implementing GCS upload functionality to complete the storage provider interface.
python/kserve/docs/V1beta1LoggerSpec.md (1)
10-10: Fix markdown table formatting issues.The addition of the
storageproperty is correct and aligns with the logger storage enhancement. However, the markdown table has formatting issues that need to be addressed.Apply this fix to correct the table formatting:
-**storage** | [**V1beta1LoggerStorageSpec**](V1beta1LoggerStorageSpec.md) | | [optional] +**storage** | [**V1beta1LoggerStorageSpec**](V1beta1LoggerStorageSpec.md) | | [optional]The table should have consistent column count and no trailing pipes.
pkg/client/clientset/versioned/fake/clientset_generated.go (1)
55-61: Good enhancement with a typo that needs fixing.The enhancement to extract and pass ListOptions to the watch reactor is a valuable improvement that enables proper filtering and option handling in the fake clientset. However, there's a typo on line 56.
Fix the typo in the variable name:
- if watchActcion, ok := action.(testing.WatchActionImpl); ok { + if watchAction, ok := action.(testing.WatchActionImpl); ok {python/kserve/docs/V1alpha1LLMInferenceService.md (1)
1-16: Fix markdown formatting issues while maintaining good content structure.The documentation content is well-structured and appropriately describes the LLM inference service resource. However, there are some markdown formatting issues that should be addressed:
+ Name | Type | Description | Notes ------------ | ------------- | ------------- | ------------- -**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional] -**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | [optional] +**api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: <https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources> | [optional] +**kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: <https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds> | [optional] **metadata** | [**V1ObjectMeta**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ObjectMeta.md) | | [optional] **spec** | [**V1alpha1LLMInferenceServiceSpec**](V1alpha1LLMInferenceServiceSpec.md) | | [optional] **status** | [**V1alpha1LLMInferenceServiceStatus**](V1alpha1LLMInferenceServiceStatus.md) | | [optional] +python/kserve/docs/V1beta1ModelStorageSpec.md (1)
1-14: Fix markdown table formatting for consistency.The documentation content accurately describes the ModelStorageSpec properties. However, there's a minor formatting issue with the table spacing.
# V1beta1ModelStorageSpec + ## Properties Name | Type | Description | Notes ------------ | ------------- | ------------- | ------------- **key** | **str** | The Storage Key in the secret for this object. | [optional] **parameters** | **dict(str, str)** | Parameters to override the default storage credentials and config. | [optional] **path** | **str** | The path to the object in the storage. Note that this path is relative to the storage URI. | [optional] **schema_path** | **str** | The path to the model schema file in the storage. | [optional] +python/kserve/docs/V1alpha1LLMInferenceServiceList.md (1)
4-10: Fix table formatting for better markdown rendering.The table should be surrounded by blank lines to comply with markdown standards and ensure proper rendering.
# V1alpha1LLMInferenceServiceList LLMInferenceServiceList is the list type for LLMInferenceService. + ## Properties Name | Type | Description | Notes ------------ | ------------- | ------------- | ------------- **api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional] **items** | [**list[V1alpha1LLMInferenceService]**](V1alpha1LLMInferenceService.md) | | **kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | [optional] **metadata** | [**V1ListMeta**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ListMeta.md) | | [optional] +python/kserve/docs/V1beta1LoggerStorageSpec.md (1)
3-10: Fix table formatting issues for proper markdown rendering.The table has multiple formatting problems that need to be addressed:
- Missing blank lines around the table
- Trailing pipe without proper column data on line 10
- Missing description for the
service_account_namefield# V1beta1LoggerStorageSpec + ## Properties Name | Type | Description | Notes ------------ | ------------- | ------------- | ------------- **key** | **str** | The Storage Key in the secret for this object. | [optional] **parameters** | **dict(str, str)** | Parameters to override the default storage credentials and config. | [optional] **path** | **str** | The path to the object in the storage. Note that this path is relative to the storage URI. | [optional] -**service_account_name** | **str** | | [optional] +**service_account_name** | **str** | The service account name for storage access. | [optional] +python/kserve/docs/V1alpha1LLMInferenceServiceConfig.md (1)
4-10: Add blank lines around the table for better markdown compliance.The table should be surrounded by blank lines according to markdown best practices.
LLMInferenceServiceConfig is the Schema for the llminferenceserviceconfigs API. It acts as a template to provide base configurations that can be inherited by multiple LLMInferenceService instances. + ## Properties Name | Type | Description | Notes ------------ | ------------- | ------------- | ------------- **api_version** | **str** | APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources | [optional] **kind** | **str** | Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds | [optional] **metadata** | [**V1ObjectMeta**](https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1ObjectMeta.md) | | [optional] **spec** | [**V1alpha1LLMInferenceServiceSpec**](V1alpha1LLMInferenceServiceSpec.md) | | [optional] +python/kserve/docs/V1beta1StorageSpec.md (1)
4-10: Add blank lines around the table for better markdown compliance.The table should be surrounded by blank lines according to markdown best practices.
StorageSpec defines a spec for an object in an object store + ## Properties Name | Type | Description | Notes ------------ | ------------- | ------------- | ------------- **key** | **str** | The Storage Key in the secret for this object. | [optional] **parameters** | **dict(str, str)** | Parameters to override the default storage credentials and config. | [optional] **path** | **str** | The path to the object in the storage. Note that this path is relative to the storage URI. | [optional] +python/kserve/kserve/models/v1beta1_logger_storage_spec.py (1)
153-172: Missing detailed documentation for service_account_name property.The
service_account_nameproperty getter and setter lack detailed docstrings compared to other properties (likekey,parameters,path). For consistency and better API documentation, consider adding a description similar to the other properties.@property def service_account_name(self): """Gets the service_account_name of this V1beta1LoggerStorageSpec. # noqa: E501 + The name of the service account to use for accessing the storage. # noqa: E501 :return: The service_account_name of this V1beta1LoggerStorageSpec. # noqa: E501 :rtype: str """ return self._service_account_name @service_account_name.setter def service_account_name(self, service_account_name): """Sets the service_account_name of this V1beta1LoggerStorageSpec. + The name of the service account to use for accessing the storage. # noqa: E501 :param service_account_name: The service_account_name of this V1beta1LoggerStorageSpec. # noqa: E501 :type: str """config/crd/full/serving.kserve.io_inferenceservices.yaml (4)
738-739: Add description & basic validation for the newstopSignalfieldThe UI/CLI will surface this field without any context. Add at least a
descriptionand (optionally) anenum/x-kubernetes-validationsrule so users know what values such asSIGTERM,SIGKILL, etc. are accepted.stopSignal: + description: Signal that the control-plane sends to gracefully stop the + container (e.g. "SIGTERM"). If omitted the default Docker stop signal + is used. type: string +# x-kubernetes-validations: +# - rule: self =~ '^SIG[A-Z]+$' +# message: stopSignal must be a POSIX signal name, e.g. SIGTERM
2825-2837:logger.storageobject lacks descriptions & required markersThe new
storageblock is self-explanatory only for maintainers. Adddescriptionfields and declare which keys are required (at minimumkeyorpathlooks mandatory).storage: properties: key: + description: Identifier of the storage backend (e.g. "s3", "gcs"). type: string parameters: additionalProperties: type: string type: object + description: Arbitrary backend-specific configuration parameters. path: + description: Absolute path or URI prefix inside the backend where + request/response logs are written. type: string serviceAccountName: + description: Service account used by the logger sidecar to access + the storage backend. type: string type: object +# required: +# - key
21480-21481: Add description forclusterServingRuntimeNameWithout a description it shows up as an opaque string in generated docs/CRD explain output.
clusterServingRuntimeName: + description: Name of the cluster-level ServingRuntime resource backing + this InferenceService, when applicable. type: string
21628-21629: Add description forservingRuntimeNameSame documentation concern as above.
servingRuntimeName: + description: Name of the namespace-scoped ServingRuntime resource that + instantiated this InferenceService. type: stringpkg/logger/mock.go (1)
26-43: Consider using a buffered channel to prevent blockingThe
ResponseChanis unbuffered, which means theStoremethod will block if there's no goroutine ready to receive from the channel. This could cause issues in tests if the receiver isn't set up before callingStore.func NewMockStore(storageSpec *v1beta1.StorageSpec) *MockStore { return &MockStore{ StorageSpec: storageSpec, - ResponseChan: make(chan *LogRequest), + ResponseChan: make(chan *LogRequest, 1), } }pkg/webhook/admission/pod/agent_injector.go (1)
209-223: Simplify redundant nil checksThe code has redundant checks for
ag.loggerConfig.Store. Consider combining them for better readability.- storagePath := "" - if ag.loggerConfig.Store != nil { - if ag.loggerConfig.Store.Path != nil { - storagePath = *ag.loggerConfig.Store.Path - } - } - storageFormat := "" - if ag.loggerConfig.Store != nil { - if ag.loggerConfig.Store.Parameters != nil { - format, ok := (*ag.loggerConfig.Store.Parameters)[constants.LoggerFormatKey] - if ok { - storageFormat = format - } - } - } + storagePath := "" + storageFormat := "" + if ag.loggerConfig.Store != nil { + if ag.loggerConfig.Store.Path != nil { + storagePath = *ag.loggerConfig.Store.Path + } + if ag.loggerConfig.Store.Parameters != nil { + if format, ok := (*ag.loggerConfig.Store.Parameters)[constants.LoggerFormatKey]; ok { + storageFormat = format + } + } + }pkg/openapi/swagger.json (2)
3306-3309: New runtime-name fields: addomitempty& RBAC-safe exposureThe newly introduced
clusterServingRuntimeNameandservingRuntimeNamefields are optional, but the schema lacksx-kubernetes-list-type/nullable/omitemptyhints.
Withoutomitempty, server-side apply can introduce superfluous empty strings, cluttering status patches.Likewise, exposing cluster-scoped runtime names inside namespace-scoped resources can leak information across tenants; confirm that the controller cleanses these fields when multi-tenant RBAC is in effect.
Also applies to: 3342-3345
3681-3714:serviceAccountNameproperty undocumented
v1beta1.LoggerStorageSpec.serviceAccountNamelacks a description unlike its peers. Add a short docstring so the generated client & docs are self-explanatory."serviceAccountName": { - "type": "string" + "description": "Kubernetes ServiceAccount used by the logger side-car (overrides default SA).", + "type": "string" }pkg/logger/store.go (3)
38-50: Consider renaming the parameter for clarityThe parameter is named
urlbut it's actually a string that hasn't been parsed yet. Consider renaming it tourlStrorrawURLto make this clearer.-func GetStorageStrategy(url string) StorageStrategy { +func GetStorageStrategy(urlStr string) StorageStrategy { // http, https switch { - case strings.HasPrefix(url, "http"): // http, https + case strings.HasPrefix(urlStr, "http"): // http, https return HttpStorage - case strings.HasPrefix(url, "s3"): // s3, s3a + case strings.HasPrefix(urlStr, "s3"): // s3, s3a return S3Storage default: return DefaultStorage } }
104-108: Consider a cleaner approach for protocol conversionThe current approach of appending "://" to convert scheme to Protocol is a workaround. Consider creating a dedicated mapping function or using a cleaner conversion approach.
- // Convert to a Protocol to reuse existing types - if !strings.HasSuffix(scheme, "://") { - scheme += "://" - } - protocol := storage.Protocol(scheme) + // Map scheme to Protocol + protocol := storage.Protocol(scheme + "://")
151-151: Enhance the success log message with more contextThe success message could be more informative by including details about what was uploaded.
- s.log.Info("Successfully uploaded object to S3") + s.log.Infof("Successfully uploaded log to S3: bucket=%s, key=%s", bucket, objectKey)pkg/apis/serving/v1alpha1/llm_inference_service_types.go (1)
149-172: Consider adding validation for mutually exclusive fieldsThe comment states that Route/Gateway and Ingress fields are mutually exclusive, but this isn't enforced at the type level. Consider adding OpenAPI schema validation or webhook validation to ensure only one routing option is specified.
9f4edf6 to
55359b2
Compare
chore: required by "sigs.k8s.io/gateway-api-inference-extension/api/v1alpha2" -> llm_inference_service_types.go Signed-off-by: Spolti <fspolti@redhat.com>
55359b2 to
cd65275
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
Makefile (1)
142-150: Consider-prunefor speed & clarity + cover plainvenvdirs
You are already guarding against several virtual-env folders, but:
- A plain
venv/(without the dot) is still common and will be scanned.- Using a chain of
-not -path …keeps traversing the unwanted trees;-pruneprevents the descent and is noticeably faster on large repos.Example refactor (keeps the existing semantics):
-for file in $$(find . -type f -name "pyproject.toml" \ - -not -path "./pyproject.toml" \ - -not -path "*.venv/*" \ - -not -path "*/venv/*" \ - -not -path "*/prod_venv/*" \ - -not -path "*/site-packages/*" \ - -not -path "*/__pypackages__/*" \ - -not -path "*/build/*"); do \ +for file in $$(find . \( -path "*/.venv" -o -path "*/venv" -o -path "*/prod_venv" \ + -o -path "*/site-packages" -o -path "*/__pypackages__" -o -path "*/build" \) -prune \ + -o -name "pyproject.toml" -not -path "./pyproject.toml" -print); do \This keeps the intent while improving performance and coverage.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (27)
.flake8(1 hunks)Makefile(1 hunks)go.mod(8 hunks)pkg/apis/serving/v1alpha1/inference_graph_validation_test.go(3 hunks)pkg/apis/serving/v1alpha1/trainedmodel_webhook_test.go(3 hunks)pkg/apis/serving/v1beta1/configmap_test.go(12 hunks)pkg/apis/serving/v1beta1/inference_service_validation_test.go(22 hunks)pkg/apis/serving/v1beta1/predictor_model_test.go(1 hunks)pkg/controller/v1alpha1/trainedmodel/reconcilers/modelconfig/modelconfig_reconciler_test.go(6 hunks)pkg/controller/v1beta1/inferenceservice/components/explainer.go(1 hunks)pkg/controller/v1beta1/inferenceservice/components/transformer.go(2 hunks)pkg/controller/v1beta1/inferenceservice/reconcilers/autoscaler/autoscaler_reconciler_test.go(1 hunks)pkg/controller/v1beta1/inferenceservice/reconcilers/cabundleconfigmap/cabundle_configmap_reconciler_test.go(10 hunks)pkg/controller/v1beta1/inferenceservice/reconcilers/deployment/deployment_reconciler_test.go(2 hunks)pkg/controller/v1beta1/inferenceservice/reconcilers/hpa/hpa_reconciler_test.go(4 hunks)pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler_test.go(23 hunks)pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ingress_reconciler_test.go(1 hunks)pkg/controller/v1beta1/inferenceservice/reconcilers/keda/keda_reconciler_test.go(4 hunks)pkg/controller/v1beta1/inferenceservice/reconcilers/knative/ksvc_reconciler_test.go(3 hunks)pkg/controller/v1beta1/inferenceservice/reconcilers/otel/otel_reconciler_test.go(2 hunks)pkg/controller/v1beta1/inferenceservice/utils/utils_test.go(6 hunks)pkg/credentials/service_account_credentials_test.go(18 hunks)pkg/openapi/openapi_generated.go(24 hunks)pkg/webhook/admission/localmodelcache/local_model_cache_validation_test.go(5 hunks)pkg/webhook/admission/pod/mutator_test.go(2 hunks)pkg/webhook/admission/pod/storage_initializer_injector_test.go(7 hunks)pkg/webhook/admission/servingruntime/servingruntime_webhook_test.go(1 hunks)
✅ Files skipped from review due to trivial changes (11)
- pkg/controller/v1beta1/inferenceservice/reconcilers/autoscaler/autoscaler_reconciler_test.go
- .flake8
- pkg/controller/v1alpha1/trainedmodel/reconcilers/modelconfig/modelconfig_reconciler_test.go
- pkg/controller/v1beta1/inferenceservice/reconcilers/hpa/hpa_reconciler_test.go
- pkg/controller/v1beta1/inferenceservice/reconcilers/deployment/deployment_reconciler_test.go
- pkg/controller/v1beta1/inferenceservice/reconcilers/cabundleconfigmap/cabundle_configmap_reconciler_test.go
- pkg/apis/serving/v1beta1/configmap_test.go
- pkg/controller/v1beta1/inferenceservice/reconcilers/keda/keda_reconciler_test.go
- pkg/apis/serving/v1beta1/inference_service_validation_test.go
- pkg/webhook/admission/pod/storage_initializer_injector_test.go
- pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler_test.go
🚧 Files skipped from review as they are similar to previous changes (5)
- pkg/controller/v1beta1/inferenceservice/components/explainer.go
- pkg/controller/v1beta1/inferenceservice/components/transformer.go
- pkg/controller/v1beta1/inferenceservice/utils/utils_test.go
- pkg/credentials/service_account_credentials_test.go
- go.mod
🧰 Additional context used
🧠 Learnings (6)
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ingress_reconciler_test.go (1)
Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.
pkg/apis/serving/v1beta1/predictor_model_test.go (1)
Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.
pkg/webhook/admission/pod/mutator_test.go (1)
Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.
pkg/apis/serving/v1alpha1/inference_graph_validation_test.go (1)
Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.
pkg/apis/serving/v1alpha1/trainedmodel_webhook_test.go (1)
Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.
pkg/openapi/openapi_generated.go (2)
undefined
<retrieved_learning>
Learnt from: israel-hdez
PR: #738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.
</retrieved_learning>
<retrieved_learning>
Learnt from: israel-hdez
PR: #738
File: pkg/types/config.go:30-30
Timestamp: 2025-07-17T16:20:08.917Z
Learning: In pkg/types/config.go, the StorageInitializerConfig struct field EnableOciImageSource uses the JSON tag "enableModelcar" (instead of matching the field name) to maintain backwards compatibility after the struct was moved from another package. The JSON tag cannot be changed to avoid breaking existing configurations.
</retrieved_learning>
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: precommit-check
- GitHub Check: test
- GitHub Check: test
- GitHub Check: build (3.11)
- GitHub Check: test
🔇 Additional comments (17)
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/ingress_reconciler_test.go (1)
2145-2145: LGTM: Good practice using test context.The change from
context.TODO()tot.Context()is a good improvement as it provides proper context propagation tied to the test's lifecycle, enabling better timeout handling and cancellation.pkg/webhook/admission/servingruntime/servingruntime_webhook_test.go (1)
1885-1885: LGTM! Improved context usage in test.This change properly uses the test's context instead of a generic TODO context, which improves context propagation and aligns with testing best practices.
pkg/apis/serving/v1beta1/predictor_model_test.go (1)
493-493: LGTM! Consistent context usage improvement.This change properly uses the test's context instead of a background context, which is consistent with the broader pattern across the codebase and improves context management in tests.
pkg/webhook/admission/pod/mutator_test.go (1)
53-53: LGTM! Comprehensive context usage improvements.These changes consistently replace
context.Background()witht.Context()across all Kubernetes client operations and the mutator's Handle method, ensuring proper context propagation throughout the test lifecycle.Also applies to: 298-298, 306-306, 309-309
pkg/webhook/admission/localmodelcache/local_model_cache_validation_test.go (1)
108-108: LGTM! Consistent context improvements across validation tests.These changes properly replace
context.Background()andcontext.TODO()witht.Context()across all validator method calls (ValidateDelete, ValidateCreate, ValidateUpdate), ensuring consistent context propagation throughout the validation test suite.Also applies to: 124-124, 156-156, 174-174, 186-186
pkg/controller/v1beta1/inferenceservice/reconcilers/otel/otel_reconciler_test.go (1)
174-174: LGTM! Improved context usage in reconciler tests.These changes consistently replace
context.TODO()witht.Context()across both reconciler operations and client calls in the test functions, ensuring proper context propagation and alignment with testing best practices.Also applies to: 179-179, 256-256, 261-261
pkg/apis/serving/v1alpha1/inference_graph_validation_test.go (1)
273-273: LGTM! Consistent context usage improvement.The changes properly replace
context.Background()witht.Context()in all validation method calls, which ties the validation context to the test lifecycle. This enables proper timeout handling and cancellation, following Go testing best practices.Also applies to: 313-313, 351-351
pkg/controller/v1beta1/inferenceservice/reconcilers/knative/ksvc_reconciler_test.go (1)
351-351: LGTM! Improved context handling in controller tests.The changes consistently replace
context.TODO()witht.Context()in client operations and reconciler calls. This improvement ensures proper context lifecycle management and enables timeout handling in controller tests.Also applies to: 367-367, 378-378
pkg/apis/serving/v1alpha1/trainedmodel_webhook_test.go (1)
138-138: LGTM! Consistent context usage improvement in webhook tests.The changes properly replace
context.Background()witht.Context()in all validation method calls, ensuring that webhook validation operations are tied to the test lifecycle. This enables proper timeout handling and follows Go testing best practices.Also applies to: 263-263, 292-292
pkg/openapi/openapi_generated.go (8)
31-47: LGTM! New LLM type registrations follow established patterns.The additions of LLM-related OpenAPI definitions are consistent with the existing pattern and align with the introduction of new CRDs for LLM inference services.
860-1044: Schema definitions are well-structured and complete.The new LLM-related schemas follow Kubernetes API conventions correctly:
- Primary resource (LLMInferenceService) includes both spec and status
- Config resource (LLMInferenceServiceConfig) appropriately omits status
- List types are properly defined with required items field
- All dependencies are correctly specified
2912-2912: Documentation improvements enhance clarity.The init container description updates fix grammatical issues and improve readability.
Also applies to: 3402-3402, 3892-3892
2566-2566: Consistent refactoring of storage specifications across all model types.The systematic replacement of
StorageSpecwithModelStorageSpecacross all model-serving components is well-executed. This change provides better type specificity and aligns with the introduction of separate storage types for different purposes (models vs. logs).Also applies to: 4709-4709, 5788-5788, 6548-6548, 7224-7224, 7667-7667, 8071-8071, 8396-8396, 9263-9263, 10294-10294, 10702-10702, 11028-11028, 11921-11921, 12715-12715
6038-6051: Runtime tracking fields enhance service observability.The addition of
servingRuntimeNameandclusterServingRuntimeNamefields to the InferenceServiceStatus provides valuable information for tracking which runtime an InferenceService is using.
6668-6726: Logger storage configuration properly implemented.The addition of
LoggerStorageSpecand its integration intoLoggerSpecenables proper configuration of storage for inference logger cloud events. The schema structure is consistent with other storage specifications while appropriately omitting model-specific fields likeschemaPath.
7277-7324: ModelStorageSpec provides appropriate model-specific storage configuration.The new
ModelStorageSpectype extends the generic storage configuration with aschemaPathfield, which is valuable for model schema validation scenarios. This specialization justifies the refactoring from the genericStorageSpec.
10349-10349: Documentation updates improve accuracy.The description updates for
StorageSpecmake the documentation more precise by clarifying that paths are relative to the storage URI and using more generic terminology ("object" instead of specific types).Also applies to: 10354-10354, 10377-10377
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (2)
pkg/controller/v1beta1/inferenceservice/controller.go (2)
446-476: Issues remain in commented code.The commented
clusterServingRuntimeFuncmethod still contains the same issues identified in previous reviews:
- Incorrect namespace filtering for cluster-scoped resources
- Missing filtering to only enqueue InferenceServices that reference the specific ClusterServingRuntime
While this code is currently commented out, these issues should be addressed before uncommenting.
502-516: Field path mismatch persists.The index field path
"spec.predictor.model.runtime"still doesn't match the returned values fromStatus.ServingRuntimeName. This is the same issue identified in previous reviews.
🧹 Nitpick comments (1)
pkg/controller/v1beta1/inferenceservice/controller.go (1)
446-446: Fix comment formatting.Add spaces after
//in comments for consistency with Go formatting standards.-//func (r *InferenceServiceReconciler) clusterServingRuntimeFunc(ctx context.Context, obj client.Object) []reconcile.Request { +// func (r *InferenceServiceReconciler) clusterServingRuntimeFunc(ctx context.Context, obj client.Object) []reconcile.Request {Apply similar formatting to other comment lines (510, 530, 598).
Also applies to: 510-510, 530-530, 598-598
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pkg/controller/v1beta1/inferenceservice/controller.go(7 hunks)
🧰 Additional context used
🧠 Learnings (1)
pkg/controller/v1beta1/inferenceservice/controller.go (3)
Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.
Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/types/config.go:30-30
Timestamp: 2025-07-17T16:20:08.917Z
Learning: In pkg/types/config.go, the StorageInitializerConfig struct field EnableOciImageSource uses the JSON tag "enableModelcar" (instead of matching the field name) to maintain backwards compatibility after the struct was moved from another package. The JSON tag cannot be changed to avoid breaking existing configurations.
Learnt from: brettmthompson
PR: opendatahub-io/kserve#709
File: pkg/controller/v1alpha1/utils/utils.go:118-128
Timestamp: 2025-06-30T22:30:38.045Z
Learning: In pkg/controller/v1alpha1/utils/utils.go, the condition `minReplicas == nil && constants.DefaultMinReplicas == 0` in ValidateInitialScaleAnnotation function is intentionally kept even though currently unreachable (DefaultMinReplicas = 1) to handle potential future changes to DefaultMinReplicas value, implementing defensive programming to limit future code changes.
🪛 golangci-lint (1.64.8)
pkg/controller/v1beta1/inferenceservice/controller.go
446-446: commentFormatting: put a space between // and comment text
(gocritic)
510-510: commentFormatting: put a space between // and comment text
(gocritic)
530-530: commentFormatting: put a space between // and comment text
(gocritic)
598-598: commentFormatting: put a space between // and comment text
(gocritic)
🔇 Additional comments (5)
pkg/controller/v1beta1/inferenceservice/controller.go (5)
48-48: LGTM: Required import for new handler functionality.The handler import is correctly added to support the new
EnqueueRequestsFromMapFuncusage.
205-212: LGTM: Proper auto-update check implementation.The early return logic correctly prevents unnecessary reconciliations when auto-update is disabled and the service is ready. This optimization aligns with the similar check in the
servingRuntimeFuncmethod.
411-444: LGTM: Correct ServingRuntime change handling.The method correctly handles namespaced ServingRuntime resources by:
- Filtering InferenceServices by namespace
- Respecting the auto-update annotation
- Only enqueueing InferenceServices that reference the specific ServingRuntime
This implementation follows proper controller patterns.
518-527: LGTM: Appropriate predicate for spec changes.The predicate correctly filters to only trigger reconciliation on ServingRuntime spec changes, ignoring create/delete/generic events. This is an appropriate optimization.
597-599: LGTM: Correct watch setup for ServingRuntime.The watch configuration correctly connects the ServingRuntime resources with the mapping function and predicate. The commented ClusterServingRuntime watch is consistent with the commented implementation.
4dff74d to
b60968a
Compare
chore: added as part of kserve@a070e2c Signed-off-by: Spolti <fspolti@redhat.com>
b60968a to
efecad2
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go (2)
4022-4022: Fix incorrect comment.The comment incorrectly states auto-update is disabled when it's actually enabled (annotation value is "false" which means disable auto-update is false, i.e., auto-update is enabled).
- // Define InferenceService with auto-update disabled. + // Define InferenceService with auto-update enabled.
4379-4379: Fix incorrect comment (duplicate issue).Same issue as above - the comment incorrectly states auto-update is disabled when it's actually enabled.
- // Define InferenceService with auto-update disabled. + // Define InferenceService with auto-update enabled.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
pkg/apis/serving/v1alpha1/servingruntime_types.go(1 hunks)pkg/controller/v1beta1/inferenceservice/controller.go(7 hunks)pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go(17 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/controller/v1beta1/inferenceservice/controller.go
🧰 Additional context used
🧠 Learnings (2)
pkg/apis/serving/v1alpha1/servingruntime_types.go (2)
Learnt from: israel-hdez
PR: #738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.
Learnt from: brettmthompson
PR: #709
File: pkg/controller/v1alpha1/utils/utils.go:118-128
Timestamp: 2025-06-30T22:30:38.045Z
Learning: In pkg/controller/v1alpha1/utils/utils.go, the condition minReplicas == nil && constants.DefaultMinReplicas == 0 in ValidateInitialScaleAnnotation function is intentionally kept even though currently unreachable (DefaultMinReplicas = 1) to handle potential future changes to DefaultMinReplicas value, implementing defensive programming to limit future code changes.
pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go (3)
Learnt from: israel-hdez
PR: #738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.
Learnt from: israel-hdez
PR: #738
File: pkg/types/config.go:30-30
Timestamp: 2025-07-17T16:20:08.917Z
Learning: In pkg/types/config.go, the StorageInitializerConfig struct field EnableOciImageSource uses the JSON tag "enableModelcar" (instead of matching the field name) to maintain backwards compatibility after the struct was moved from another package. The JSON tag cannot be changed to avoid breaking existing configurations.
Learnt from: brettmthompson
PR: #709
File: pkg/controller/v1alpha1/utils/utils.go:118-128
Timestamp: 2025-06-30T22:30:38.045Z
Learning: In pkg/controller/v1alpha1/utils/utils.go, the condition minReplicas == nil && constants.DefaultMinReplicas == 0 in ValidateInitialScaleAnnotation function is intentionally kept even though currently unreachable (DefaultMinReplicas = 1) to handle potential future changes to DefaultMinReplicas value, implementing defensive programming to limit future code changes.
🔇 Additional comments (9)
pkg/apis/serving/v1alpha1/servingruntime_types.go (1)
284-285: Dependent Code Verified – No Active ClusterServingRuntime RegistrationsA project-wide search confirms that all controllers, webhooks, clients, informers, and tests referencing ClusterServingRuntime are commented out or removed from the active code paths. No unregistered types are used at runtime.
• SchemeBuilder only registers ServingRuntime types.
• Controllers/watchers for ClusterServingRuntime are entirely commented out.
• GetServingRuntime falls back exclusively to namespace-scoped ServingRuntime.
• Webhook handlers and informer/client code for ClusterServingRuntime remain disabled.
• Tests for ClusterServingRuntime are likewise commented out.No further action is needed here.
pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go (8)
63-63: Good addition of consistency timeout constant.The new
consistentlyTimeoutconstant is appropriately set to 5 seconds for consistency checks in the test suite.
587-587: Correctly updated status expectation.The addition of
ServingRuntimeNamefield to the expected status is consistent with the new functionality for tracking which serving runtime is being used.
1147-1147: Consistent status field updates.The addition of
ServingRuntimeNamefields across multiple test expectations correctly reflects the new status tracking functionality.Also applies to: 1703-1703
2211-2211: Good typo correction.Fixed the typo from "Kubernestes" to "Kubernetes" in the comment.
3805-3949: Well-structured test for auto-update behavior when annotation is missing.The test correctly verifies that when the auto-update annotation is not present, the deployment gets updated when the serving runtime changes. The test flow is logical and uses appropriate assertions.
4094-4253: Proper negative test case with consistency check.This test correctly verifies that when auto-update is disabled (
"true"value), the deployment labels remain unchanged even when the serving runtime is updated. The use ofConsistentlywithconsistentlyTimeoutis appropriate for this negative assertion.
4254-4484: Good test for selective reconciliation with multiple serving runtimes.The test properly validates that only the deployment associated with the updated serving runtime gets reconciled, while other deployments remain unchanged. This ensures the controller correctly maps serving runtime changes to the appropriate inference services.
4830-4830: Consistent ServingRuntimeName additions across all test expectations.All test cases have been properly updated to include the
ServingRuntimeNamefield in their expected status, maintaining consistency across the test suite.Also applies to: 5407-5407, 6251-6251, 7204-7204, 7896-7896, 8790-8790, 9835-9835, 10727-10727, 11249-11249
|
/test e2e-predictor |
|
/approve |
|
/lgtm /approve |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mholder6, spolti, VedantMahabaleshwarkar The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
36f1080
into
opendatahub-io:master
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)format, will close the issue(s) when PR gets merged):Fixes #
Type of changes
Please delete options that are not relevant.
Feature/Issue validation/testing:
Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.
Test A
Test B
Logs
Special notes for your reviewer:
Checklist:
Release note:
Re-running failed tests
/rerun-all- rerun all failed workflows./rerun-workflow <workflow name>- rerun a specific failed workflow. Only one workflow name can be specified. Multiple /rerun-workflow commands are allowed per comment.Summary by CodeRabbit