Skip to content

250731 sync kserve/master into odh/master batch1#805

Merged
openshift-merge-bot[bot] merged 10 commits into
opendatahub-io:masterfrom
mholder6:250731_sync_upstream_batch1
Jul 31, 2025
Merged

250731 sync kserve/master into odh/master batch1#805
openshift-merge-bot[bot] merged 10 commits into
opendatahub-io:masterfrom
mholder6:250731_sync_upstream_batch1

Conversation

@mholder6
Copy link
Copy Markdown

@mholder6 mholder6 commented Jul 31, 2025

What this PR does / why we need it:
Synced the kserve/master branch into odh/master branch [batch1]
[RHOAIENG-29333]

Summary by CodeRabbit

  • New Features

    • Introduced new controller, configuration, and Helm charts for LLMInferenceService, enabling deployment and management of large language model inference services via KServe.
    • Added dynamic resource configuration for storage components in Helm charts, allowing user customization of CPU and memory requests/limits.
    • Added context-based predictor configuration for Python model servers, supporting flexible protocol and SSL settings.
  • Bug Fixes

    • Improved CA bundle handling for S3 storage downloads, ensuring proper SSL verification based on environment variables.
  • Documentation

    • Added and updated Helm chart READMEs to document new configuration options and installation instructions.
  • Tests

    • Expanded and updated test suites for configuration merging, CA bundle handling, and context-based predictor configuration.
  • Chores

    • Updated dependencies, refactored import aliases, and improved test environment setup for robustness and maintainability.

sivanantha321 and others added 10 commits June 28, 2025 11:45
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: tohana <tohana@akamai.com>
Co-authored-by: Sivanantham <90966311+sivanantha321@users.noreply.github.com>
Signed-off-by: Bartosz Majsak <bartosz.majsak@gmail.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Vedant Mahabaleshwarkar <vmahabal@redhat.com>
Co-authored-by: Bartosz Majsak <bartosz.majsak@gmail.com>
Co-authored-by: Pierangelo Di Pilato <pierdipi@redhat.com>
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Signed-off-by: Bartosz Majsak <bartosz.majsak@gmail.com>
Co-authored-by: Jooho Lee <jlee@redhat.com>
Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jul 31, 2025

Walkthrough

This update introduces a new LLMInferenceService controller and associated Helm charts for KServe, including extensive configuration merging and templating logic, new Dockerfile, and RBAC/CRD integration. It standardizes import aliases for Gateway API, improves test CRD path resolution, and refactors the Python predictor configuration to use a shared context. Additional enhancements include CA bundle handling for S3 storage and expanded test coverage.

Changes

Cohort / File(s) Change Summary
LLMInferenceService Controller & Config Merge
cmd/llmisvc/main.go, llmisvc-controller.Dockerfile, pkg/controller/v1alpha1/llmisvc/controller.go, pkg/controller/v1alpha1/llmisvc/controller_types.go, pkg/controller/v1alpha1/llmisvc/config_merge.go, pkg/controller/v1alpha1/llmisvc/config_merge_test.go
Introduces new LLMInferenceService controller binary, Dockerfile, RBAC, and config merge/templating logic with comprehensive tests. Adds exported controller struct, config struct, and merging functions.
Helm Charts for llmisvc
charts/llmisvc-crd/Chart.yaml, charts/llmisvc-crd/README.md, charts/llmisvc-crd/README.md.gotmpl, charts/llmisvc-resources/Chart.yaml, charts/llmisvc-resources/README.md, charts/llmisvc-resources/README.md.gotmpl, charts/llmisvc-resources/templates/_helpers.tpl, charts/llmisvc-resources/templates/clusterrole.yaml, charts/llmisvc-resources/templates/deployment.yaml, charts/llmisvc-resources/templates/service.yaml, charts/llmisvc-resources/templates/serviceaccount.yaml, charts/llmisvc-resources/values.schema.json, charts/llmisvc-resources/values.yaml
Adds new Helm charts and templates for deploying llmisvc controller and CRDs, including helper templates, RBAC, deployment, service, service account, schema, and documentation.
Makefile & Manifest Integration
Makefile, config/rbac/llmisvc/role.yaml
Extends Makefile to generate/copy llmisvc RBAC and CRD manifests; adds new ClusterRole YAML for llmisvc.
KServe Storage Resource Configuration
charts/kserve-resources/README.md, charts/kserve-resources/templates/clusterstoragecontainer.yaml, charts/kserve-resources/values.yaml
Adds configurable resource requests/limits for storage-initializer via Helm values and documents new keys.
Gateway API Import Alias Standardization
cmd/manager/main.go, pkg/apis/serving/v1alpha1/llm_inference_service_types.go, pkg/controller/v1beta1/inferenceservice/controller.go, pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler.go, pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler_test.go, pkg/controller/v1beta1/inferenceservice/suite_test.go, pkg/testing/envtest_setup.go
Renames Gateway API import alias to gwapiv1 throughout controllers and tests; updates all type references and usages accordingly.
Test CRD Path Resolution Improvements
pkg/controller/v1alpha1/inferencegraph/suite_test.go, pkg/controller/v1alpha1/localmodel/suite_test.go, pkg/controller/v1alpha1/localmodelnode/suite_test.go, pkg/controller/v1alpha1/trainedmodel/suite_test.go, pkg/credentials/credentials_suite_test.go, pkg/webhook/admission/pod/suite_test.go, pkg/testing/utils.go
Replaces relative CRD path logic in tests with new ProjectRoot() utility for robust project root detection.
Go Module & Linting Updates
go.mod, .golangci.yml
Adds new indirect dependencies for controller and linting; updates linter config with new import alias rule.
Python PredictorConfig Refactor & Context
python/kserve/kserve/__init__.py, python/kserve/kserve/context.py, python/kserve/kserve/predictor_config.py, python/kserve/kserve/model.py, python/kserve/kserve/model_server.py, python/kserve/kserve/protocol/dataplane.py, python/kserve/test/test_dataplane.py, python/kserve/test/test_server.py, python/huggingfaceserver/huggingfaceserver/encoder_model.py, python/huggingfaceserver/huggingfaceserver/__main__.py, python/huggingfaceserver/tests/test_model.py, python/aiffairness/aifserver/__main__.py, python/aiffairness/aifserver/model.py, python/artexplainer/artserver/__main__.py, python/artexplainer/artserver/model.py, python/custom_tokenizer/transformer.py, python/custom_transformer/model.py, python/custom_transformer/model_grpc.py
Refactors predictor config to new class and context-based access; removes direct passing of predictor config to models and servers; updates all usages and tests.
S3 Storage CA Bundle Handling & Tests
python/kserve/kserve/storage/storage.py, python/kserve/kserve/storage/test/test_s3_storage.py, python/storage/kserve_storage/kserve_storage.py, python/storage/test/test_s3_storage.py
Refactors S3 CA bundle file path logic to use os.path.join; adds comprehensive unit tests for CA bundle and SSL verification handling.
CI Workflow Update
.github/workflows/go.yml
Splits Go CI workflow into test and coverage jobs, adds coverage comparison/reporting, and modularizes steps.
Miscellaneous
docs/openshift/cert-manager/operator.yaml
Removes extraneous empty spec: {} field from OperatorGroup manifest.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant K8s API Server
    participant llmisvc-controller
    participant ConfigMap/CRD
    User->>K8s API Server: Create/Update LLMInferenceService
    K8s API Server->>llmisvc-controller: Reconcile LLMInferenceService event
    llmisvc-controller->>ConfigMap/CRD: Fetch LLMInferenceServiceConfig
    llmisvc-controller->>llmisvc-controller: MergeSpecs & ReplaceVariables
    llmisvc-controller-->>K8s API Server: Update LLMInferenceService status
Loading
sequenceDiagram
    participant ModelServer
    participant kserve.context
    participant DataPlane
    participant PredictorConfig

    ModelServer->>kserve.context: set_predictor_config(PredictorConfig)
    ModelServer->>DataPlane: instantiate (no predictor_config param)
    DataPlane->>kserve.context: get_predictor_config()
    DataPlane->>PredictorConfig: Use config for inference
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • opendatahub-io/kserve#752: Introduces LLMInferenceService and LLMInferenceServiceConfig CRDs and related controllers, directly related to the new controller and CRDs added here.
  • opendatahub-io/kserve#723: Adds validating webhook for LLMInferenceServiceConfig, related to config and validation logic in this PR.
  • opendatahub-io/kserve#774: Adds end-to-end test skeleton for LLMInferenceService, overlapping with new test infrastructure and controller changes in this PR.

Suggested labels

approved, lgtm

Suggested reviewers

  • pierDipi
  • andresllh

Poem

A rabbit hops through Helm and Go,
Merging configs high and low.
With charts and Docker, tests anew,
CA bundles checked, and context too!
Now LLMs can serve with grace—
This patch puts smiles on every face.
🐇✨

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.2.2)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/product/migration-guide for migration instructions

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (16)
charts/kserve-resources/values.yaml (1)

18-24: Align quoting style for CPU/memory values to keep the chart consistent

cpu: "1" is quoted while the other numeric‐like values (100m, 100Mi, 1Gi) are not. Kubernetes parses these as strings anyway, but the mixed quoting style is visually inconsistent with the rest of the chart and may confuse users skimming the defaults.

Consider either quoting all four entries or leaving all of them unquoted for uniformity.

charts/kserve-resources/templates/clusterstoragecontainer.yaml (1)

9-12: Suppress empty resources: block when no overrides are provided

resources: is rendered unconditionally, while the contents are gated by with.
If the user removes the default values, Helm produces:

resources:

which serialises to null in the K8s manifest. Although this is accepted by the API, omitting the field entirely is cleaner and avoids spurious diffs.

-    resources:
-      {{- with .Values.kserve.storage.resources }}
-      {{- toYaml . | nindent 6 }}
-      {{- end }}
+    {{- with .Values.kserve.storage.resources }}
+    resources:
+      {{- toYaml . | nindent 6 }}
+    {{- end }}

This nests resources: inside the with block so it is emitted only when a map is present.

python/kserve/kserve/model.py (1)

300-313: Consider simplifying nested if statements.

The null check is correctly added. However, the nested if statements can be simplified as suggested by static analysis:

-        predictor_config = self.predictor_config
-        if predictor_config is not None:
-            if predictor_config.protocol == PredictorProtocol.REST_V2.value:
-                if "inputs" in payload and not isinstance(payload["inputs"], list):
-                    raise InvalidInput('Expected "inputs" to be a list')
-            elif predictor_config.protocol == PredictorProtocol.REST_V1.value:
-                if (
-                    isinstance(payload, Dict)
-                    and "instances" in payload
-                    and not isinstance(payload["instances"], list)
-                ):
-                    raise InvalidInput('Expected "instances" to be a list')
+        predictor_config = self.predictor_config
+        if predictor_config is not None and predictor_config.protocol == PredictorProtocol.REST_V2.value:
+            if "inputs" in payload and not isinstance(payload["inputs"], list):
+                raise InvalidInput('Expected "inputs" to be a list')
+        elif predictor_config is not None and predictor_config.protocol == PredictorProtocol.REST_V1.value:
+            if (
+                isinstance(payload, Dict)
+                and "instances" in payload
+                and not isinstance(payload["instances"], list)
+            ):
+                raise InvalidInput('Expected "instances" to be a list')
charts/llmisvc-crd/README.md (1)

11-13: Minor markdown nit – remove leading $ to silence MD014

Tooling complains that prompts start with $ without showing output.

-```console
-$ helm install llmisvc-crd oci://ghcr.io/kserve/charts/llmisvc-crd --version v0.15.2
+```console
+helm install llmisvc-crd oci://ghcr.io/kserve/charts/llmisvc-crd --version v0.15.2
charts/llmisvc-crd/README.md.gotmpl (1)

10-12: Nit – same MD014 issue in template

Drop the leading $ so generated READMEs pass markdown-lint.

-```console
-$ helm install llmisvc-crd oci://ghcr.io/kserve/charts/llmisvc-crd --version {{ template "chart.version" . }}
+```console
+helm install llmisvc-crd oci://ghcr.io/kserve/charts/llmisvc-crd --version {{ template "chart.version" . }}
config/rbac/llmisvc/role.yaml (1)

7-27: ClusterRole definition looks correct but consider adding deletecollection

The verbs list omits deletecollection, which controllers commonly need for garbage-collection style clean-ups (e.g., when deleting many CRs during finalizer handling).
If the controller never calls client.DeleteAllOf(...) this is harmless, otherwise the RBAC will block the call.

   - delete
+  - deletecollection
pkg/testing/envtest_setup.go (1)

30-40: Avoid heavy google.golang.org/protobuf/proto import just for a bool pointer

Importing the full protobuf runtime only to create a *bool is unnecessary and slows compilation.
Use the lightweight utility already in the repo (k8s.io/utils/pointer or your local pkg/utils.Pointer/Bool) instead.

-	"google.golang.org/protobuf/proto"
+	"k8s.io/utils/pointer"
...
-		UseExistingCluster:    proto.Bool(false),
+		UseExistingCluster:    pointer.Bool(false),
pkg/apis/serving/v1alpha1/llm_inference_service_types.go (2)

192-193: Type coupling to upstream Gateway API versions

HTTPRouteSpec embeds *gwapiv1.HTTPRouteSpec. When Gateway-API releases a breaking change, this will cascade into your CRD.
Consider copying only the fields you need (or shallow-wrapping) to decouple your API from upstream version bumps.


275-278: Namespace field uses gwapiv1.Namespace – check CRD validation

gwapiv1.Namespace is an alias for string but without the Kubernetes-style pattern validation (^[a-z0-9]([-a-z0-9]*[a-z0-9])?$).
If you rely on CRD structural schema validation, switch to string or add an explicit kubebuilder:validation:Pattern marker.

-	Namespace gwapiv1.Namespace `json:"namespace,omitempty"`
+	// +kubebuilder:validation:Pattern=`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`
+	Namespace string `json:"namespace,omitempty"`
charts/llmisvc-resources/templates/clusterrole.yaml (1)

1-27: Keep RBAC in Helm chart and config/ stanza in sync

This ClusterRole duplicates config/rbac/llmisvc/role.yaml. Any future change must be applied in both places or drift will occur.
Consider generating the ClusterRole once (e.g., with kustomize or Helm helpers) and re-using it to avoid maintenance overhead.

charts/llmisvc-crd/Chart.yaml (1)

4-4: Fix trailing spaces.

Static analysis detected trailing spaces on this line.

-description: Helm chart for deploying LLMInferenceService crds 
+description: Helm chart for deploying LLMInferenceService crds
pkg/controller/v1alpha1/llmisvc/controller.go (1)

40-45: Missing reconciliation logic implementation.

The Reconcile method currently only logs but doesn't implement any reconciliation logic as indicated by the TODO comment on line 43.

Would you like me to help implement the reconciliation logic or open an issue to track this task?

charts/llmisvc-resources/templates/deployment.yaml (1)

147-150: Consider making the webhook certificate secret name configurable.

The secret name "kserve-webhook-server-cert" is hardcoded. Consider making it configurable through values for flexibility across different deployments.

Add to values.yaml:

kserve:
  llmisvc:
    controller:
      webhookCertSecret: "kserve-webhook-server-cert"

Then update the deployment template:

-            secretName: kserve-webhook-server-cert
+            secretName: {{ .Values.kserve.llmisvc.controller.webhookCertSecret }}
pkg/controller/v1alpha1/llmisvc/config_merge.go (1)

31-32: Add error handling for json.Marshal

While unlikely to fail with valid structs, it's a best practice to handle all errors consistently.

-	templateBytes, _ := json.Marshal(llmSvcCfg)
+	templateBytes, err := json.Marshal(llmSvcCfg)
+	if err != nil {
+		return nil, fmt.Errorf("failed to marshal config: %w", err)
+	}
charts/llmisvc-resources/templates/_helpers.tpl (1)

51-53: Consider parameterizing the deployment name

The deployment name is hardcoded to kserve-llmisvc-controller-manager. Consider making it configurable for flexibility in different deployment scenarios.

 {{- define "llm-isvc-resources.deploymentName" -}}
-kserve-llmisvc-controller-manager
+{{- default "kserve-llmisvc-controller-manager" .Values.kserve.llmisvc.controller.deploymentName }}
 {{- end }}
python/kserve/kserve/storage/test/test_s3_storage.py (1)

502-523: Combine nested with statements for better readability.

The test logic is correct, but the nested with statements can be combined.

Apply this diff to combine the with statements:

-    # Set AWS_CA_BUNDLE to a non-existent file
-    with mock.patch.dict(
-        os.environ,
-        {"AWS_CA_BUNDLE": non_existent_path, "S3_VERIFY_SSL": "true"},
-        clear=True,
-    ):
-        with pytest.raises(
-            RuntimeError,
-            match=f"Failed to find ca bundle file\\({non_existent_path}\\)",
-        ):
-            Storage._download_s3(f"s3://{bucket_name}/{object_key}", "dest_path")
+    # Set AWS_CA_BUNDLE to a non-existent file
+    with mock.patch.dict(
+        os.environ,
+        {"AWS_CA_BUNDLE": non_existent_path, "S3_VERIFY_SSL": "true"},
+        clear=True,
+    ), pytest.raises(
+        RuntimeError,
+        match=f"Failed to find ca bundle file\\({non_existent_path}\\)",
+    ):
+        Storage._download_s3(f"s3://{bucket_name}/{object_key}", "dest_path")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 36f1080 and 19a0063.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (64)
  • .github/workflows/go.yml (1 hunks)
  • .golangci.yml (1 hunks)
  • Makefile (1 hunks)
  • charts/kserve-resources/README.md (1 hunks)
  • charts/kserve-resources/templates/clusterstoragecontainer.yaml (1 hunks)
  • charts/kserve-resources/values.yaml (1 hunks)
  • charts/llmisvc-crd/Chart.yaml (1 hunks)
  • charts/llmisvc-crd/README.md (1 hunks)
  • charts/llmisvc-crd/README.md.gotmpl (1 hunks)
  • charts/llmisvc-resources/Chart.yaml (1 hunks)
  • charts/llmisvc-resources/README.md (1 hunks)
  • charts/llmisvc-resources/README.md.gotmpl (1 hunks)
  • charts/llmisvc-resources/templates/_helpers.tpl (1 hunks)
  • charts/llmisvc-resources/templates/clusterrole.yaml (1 hunks)
  • charts/llmisvc-resources/templates/deployment.yaml (1 hunks)
  • charts/llmisvc-resources/templates/service.yaml (1 hunks)
  • charts/llmisvc-resources/templates/serviceaccount.yaml (1 hunks)
  • charts/llmisvc-resources/values.schema.json (1 hunks)
  • charts/llmisvc-resources/values.yaml (1 hunks)
  • cmd/llmisvc/main.go (1 hunks)
  • cmd/manager/main.go (2 hunks)
  • config/rbac/llmisvc/role.yaml (1 hunks)
  • docs/openshift/cert-manager/operator.yaml (0 hunks)
  • go.mod (5 hunks)
  • llmisvc-controller.Dockerfile (1 hunks)
  • pkg/apis/serving/v1alpha1/llm_inference_service_types.go (3 hunks)
  • pkg/controller/v1alpha1/inferencegraph/suite_test.go (1 hunks)
  • pkg/controller/v1alpha1/llmisvc/config_merge.go (1 hunks)
  • pkg/controller/v1alpha1/llmisvc/config_merge_test.go (1 hunks)
  • pkg/controller/v1alpha1/llmisvc/controller.go (1 hunks)
  • pkg/controller/v1alpha1/llmisvc/controller_types.go (1 hunks)
  • pkg/controller/v1alpha1/localmodel/suite_test.go (1 hunks)
  • pkg/controller/v1alpha1/localmodelnode/suite_test.go (1 hunks)
  • pkg/controller/v1alpha1/trainedmodel/suite_test.go (1 hunks)
  • pkg/controller/v1beta1/inferenceservice/controller.go (3 hunks)
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler.go (28 hunks)
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler_test.go (61 hunks)
  • pkg/controller/v1beta1/inferenceservice/suite_test.go (3 hunks)
  • pkg/credentials/credentials_suite_test.go (1 hunks)
  • pkg/testing/envtest_setup.go (2 hunks)
  • pkg/testing/utils.go (1 hunks)
  • pkg/webhook/admission/pod/suite_test.go (1 hunks)
  • python/aiffairness/aifserver/__main__.py (0 hunks)
  • python/aiffairness/aifserver/model.py (1 hunks)
  • python/artexplainer/artserver/__main__.py (0 hunks)
  • python/artexplainer/artserver/model.py (1 hunks)
  • python/custom_tokenizer/transformer.py (2 hunks)
  • python/custom_transformer/model.py (5 hunks)
  • python/custom_transformer/model_grpc.py (2 hunks)
  • python/huggingfaceserver/huggingfaceserver/__main__.py (0 hunks)
  • python/huggingfaceserver/huggingfaceserver/encoder_model.py (5 hunks)
  • python/huggingfaceserver/tests/test_model.py (1 hunks)
  • python/kserve/kserve/__init__.py (1 hunks)
  • python/kserve/kserve/context.py (1 hunks)
  • python/kserve/kserve/model.py (7 hunks)
  • python/kserve/kserve/model_server.py (3 hunks)
  • python/kserve/kserve/predictor_config.py (1 hunks)
  • python/kserve/kserve/protocol/dataplane.py (2 hunks)
  • python/kserve/kserve/storage/storage.py (1 hunks)
  • python/kserve/kserve/storage/test/test_s3_storage.py (2 hunks)
  • python/kserve/test/test_dataplane.py (15 hunks)
  • python/kserve/test/test_server.py (3 hunks)
  • python/storage/kserve_storage/kserve_storage.py (1 hunks)
  • python/storage/test/test_s3_storage.py (2 hunks)
💤 Files with no reviewable changes (4)
  • python/aiffairness/aifserver/main.py
  • python/huggingfaceserver/huggingfaceserver/main.py
  • docs/openshift/cert-manager/operator.yaml
  • python/artexplainer/artserver/main.py
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#757
File: pkg/controller/llmisvc/router_discovery.go:45-49
Timestamp: 2025-07-21T17:40:11.003Z
Learning: In the KServe LLMInferenceService controller, the DiscoverURLs function in router_discovery.go intentionally returns nil, nil for missing Gateways rather than errors. This allows the reconciler to continue processing and update status conditions through the separate validation logic in ref_validation.go, preventing reconciliation loops that would occur if errors were returned.
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2559-2566
Timestamp: 2025-07-10T14:58:38.489Z
Learning: In the kserve repository, files in `test/overlays/llm-istio-experimental/` directory are considered external files by the maintainers and should not be flagged for YAML formatting issues or other code quality concerns.
📚 Learning: in pkg/apis/serving/v1beta1/configmap.go, the getstorageinitializerconfigs function was moved from a...
Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/apis/serving/v1beta1/configmap.go:0-0
Timestamp: 2025-07-17T16:19:36.771Z
Learning: In pkg/apis/serving/v1beta1/configmap.go, the GetStorageInitializerConfigs function was moved from another package to prevent circular imports, which explains why it may contain existing patterns rather than being newly written code.

Applied to files:

  • .golangci.yml
  • cmd/manager/main.go
  • pkg/controller/v1alpha1/localmodelnode/suite_test.go
  • pkg/controller/v1alpha1/trainedmodel/suite_test.go
  • pkg/controller/v1alpha1/inferencegraph/suite_test.go
  • charts/kserve-resources/templates/clusterstoragecontainer.yaml
  • pkg/apis/serving/v1alpha1/llm_inference_service_types.go
  • pkg/controller/v1beta1/inferenceservice/suite_test.go
  • python/kserve/kserve/__init__.py
  • pkg/controller/v1alpha1/localmodel/suite_test.go
  • pkg/testing/envtest_setup.go
  • pkg/controller/v1beta1/inferenceservice/controller.go
  • go.mod
  • charts/kserve-resources/values.yaml
  • pkg/controller/v1alpha1/llmisvc/controller_types.go
  • pkg/controller/v1alpha1/llmisvc/config_merge.go
  • pkg/controller/v1alpha1/llmisvc/config_merge_test.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler.go
📚 Learning: in the kserve llminferenceservice controller, the discoverurls function in router_discovery.go inten...
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#757
File: pkg/controller/llmisvc/router_discovery.go:45-49
Timestamp: 2025-07-21T17:40:11.003Z
Learning: In the KServe LLMInferenceService controller, the DiscoverURLs function in router_discovery.go intentionally returns nil, nil for missing Gateways rather than errors. This allows the reconciler to continue processing and update status conditions through the separate validation logic in ref_validation.go, preventing reconciliation loops that would occur if errors were returned.

Applied to files:

  • cmd/manager/main.go
  • pkg/apis/serving/v1alpha1/llm_inference_service_types.go
  • pkg/controller/v1beta1/inferenceservice/suite_test.go
  • pkg/controller/v1beta1/inferenceservice/controller.go
  • pkg/controller/v1alpha1/llmisvc/controller.go
  • pkg/controller/v1alpha1/llmisvc/controller_types.go
  • pkg/controller/v1alpha1/llmisvc/config_merge_test.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler_test.go
  • pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler.go
📚 Learning: the file `test/overlays/llm-istio-experimental/istiod.yaml` is considered external and irrelevant fo...
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2025-2049
Timestamp: 2025-07-10T14:58:43.762Z
Learning: The file `test/overlays/llm-istio-experimental/istiod.yaml` is considered external and irrelevant for review in this repository.

Applied to files:

  • pkg/controller/v1alpha1/localmodel/suite_test.go
  • charts/llmisvc-resources/Chart.yaml
  • charts/llmisvc-resources/templates/deployment.yaml
  • charts/llmisvc-resources/values.yaml
  • pkg/controller/v1alpha1/llmisvc/config_merge_test.go
📚 Learning: the file `test/overlays/llm-istio-experimental/istio_base.yaml` is considered external and irrelevan...
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istio_base.yaml:6854-6869
Timestamp: 2025-07-10T14:58:14.161Z
Learning: The file `test/overlays/llm-istio-experimental/istio_base.yaml` is considered external and irrelevant for review in this repository.

Applied to files:

  • pkg/controller/v1alpha1/localmodel/suite_test.go
  • charts/llmisvc-resources/Chart.yaml
  • charts/llmisvc-resources/values.yaml
  • pkg/controller/v1alpha1/llmisvc/config_merge_test.go
📚 Learning: in the kserve repository, files in `test/overlays/llm-istio-experimental/` directory are considered ...
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2559-2566
Timestamp: 2025-07-10T14:58:38.489Z
Learning: In the kserve repository, files in `test/overlays/llm-istio-experimental/` directory are considered external files by the maintainers and should not be flagged for YAML formatting issues or other code quality concerns.

Applied to files:

  • charts/llmisvc-resources/templates/service.yaml
  • Makefile
  • charts/llmisvc-resources/Chart.yaml
  • charts/llmisvc-crd/Chart.yaml
  • charts/llmisvc-resources/README.md
  • charts/llmisvc-resources/values.yaml
  • charts/llmisvc-resources/values.schema.json
  • pkg/controller/v1alpha1/llmisvc/config_merge_test.go
📚 Learning: in pkg/types/config.go, the storageinitializerconfig struct field enableociimagesource uses the json...
Learnt from: israel-hdez
PR: opendatahub-io/kserve#738
File: pkg/types/config.go:30-30
Timestamp: 2025-07-17T16:20:08.917Z
Learning: In pkg/types/config.go, the StorageInitializerConfig struct field EnableOciImageSource uses the JSON tag "enableModelcar" (instead of matching the field name) to maintain backwards compatibility after the struct was moved from another package. The JSON tag cannot be changed to avoid breaking existing configurations.

Applied to files:

  • charts/kserve-resources/README.md
🧬 Code Graph Analysis (20)
pkg/credentials/credentials_suite_test.go (1)
pkg/testing/utils.go (1)
  • ProjectRoot (26-53)
pkg/webhook/admission/pod/suite_test.go (1)
pkg/testing/utils.go (1)
  • ProjectRoot (26-53)
pkg/controller/v1alpha1/localmodelnode/suite_test.go (1)
pkg/testing/utils.go (1)
  • ProjectRoot (26-53)
pkg/controller/v1alpha1/trainedmodel/suite_test.go (1)
pkg/testing/utils.go (1)
  • ProjectRoot (26-53)
pkg/controller/v1alpha1/inferencegraph/suite_test.go (1)
pkg/testing/utils.go (1)
  • ProjectRoot (26-53)
pkg/controller/v1beta1/inferenceservice/suite_test.go (1)
pkg/testing/utils.go (1)
  • ProjectRoot (26-53)
python/kserve/kserve/__init__.py (1)
python/kserve/kserve/predictor_config.py (1)
  • PredictorConfig (20-106)
pkg/controller/v1alpha1/localmodel/suite_test.go (1)
pkg/testing/utils.go (1)
  • ProjectRoot (26-53)
python/custom_transformer/model_grpc.py (1)
python/custom_transformer/model.py (1)
  • ImageTransformer (62-118)
python/kserve/kserve/model_server.py (3)
python/kserve/kserve/protocol/dataplane.py (3)
  • predictor_config (67-69)
  • DataPlane (49-507)
  • model_registry (63-64)
python/kserve/kserve/predictor_config.py (1)
  • PredictorConfig (20-106)
python/kserve/kserve/context.py (1)
  • set_predictor_config (28-36)
pkg/testing/envtest_setup.go (2)
pkg/utils/types.go (1)
  • Bool (19-21)
pkg/client/clientset/versioned/scheme/register.go (1)
  • Scheme (31-31)
pkg/controller/v1beta1/inferenceservice/controller.go (2)
pkg/utils/utils.go (1)
  • IsCrdAvailable (191-208)
pkg/constants/constants.go (1)
  • HTTPRouteKind (546-546)
python/kserve/kserve/context.py (3)
python/kserve/kserve/model.py (1)
  • predictor_config (168-170)
python/kserve/kserve/protocol/dataplane.py (1)
  • predictor_config (67-69)
python/kserve/kserve/predictor_config.py (1)
  • PredictorConfig (20-106)
python/custom_transformer/model.py (4)
python/kserve/kserve/constants/constants.py (1)
  • PredictorProtocol (85-88)
python/kserve/kserve/model.py (1)
  • predictor_config (168-170)
python/kserve/kserve/predictor_config.py (1)
  • predictor_protocol (53-55)
python/custom_transformer/model_grpc.py (1)
  • ImageTransformer (45-69)
python/huggingfaceserver/huggingfaceserver/encoder_model.py (3)
python/kserve/kserve/model.py (1)
  • predictor_config (168-170)
python/kserve/kserve/context.py (1)
  • get_predictor_config (39-45)
python/kserve/kserve/predictor_config.py (1)
  • predictor_host (48-50)
python/kserve/kserve/model.py (4)
python/kserve/kserve/predictor_config.py (7)
  • PredictorConfig (20-106)
  • predictor_host (48-50)
  • protocol (89-91)
  • timeout (94-96)
  • retries (99-101)
  • use_ssl (104-106)
  • predictor_base_url (78-86)
python/kserve/kserve/context.py (1)
  • get_predictor_config (39-45)
python/kserve/kserve/constants/constants.py (1)
  • PredictorProtocol (85-88)
python/kserve/kserve/errors.py (1)
  • InvalidInput (41-51)
python/kserve/kserve/predictor_config.py (1)
python/kserve/kserve/constants/constants.py (1)
  • PredictorProtocol (85-88)
python/storage/test/test_s3_storage.py (2)
python/kserve/kserve/storage/test/test_s3_storage.py (1)
  • create_mock_boto3_bucket (34-42)
python/kserve/kserve/storage/storage.py (2)
  • Storage (59-798)
  • _download_s3 (193-297)
pkg/controller/v1alpha1/llmisvc/config_merge.go (2)
pkg/apis/serving/v1alpha1/llm_inference_service_types.go (3)
  • LLMInferenceService (40-46)
  • LLMInferenceServiceConfig (52-57)
  • LLMInferenceServiceSpec (60-87)
pkg/controller/v1alpha1/llmisvc/controller_types.go (1)
  • Config (19-23)
python/kserve/kserve/protocol/dataplane.py (3)
python/kserve/kserve/predictor_config.py (10)
  • PredictorConfig (20-106)
  • protocol (89-91)
  • predictor_protocol (53-55)
  • retries (99-101)
  • predictor_request_retries (68-70)
  • timeout (94-96)
  • predictor_request_timeout_seconds (63-65)
  • predictor_host (48-50)
  • use_ssl (104-106)
  • predictor_use_ssl (58-60)
python/kserve/kserve/utils/inference_client_factory.py (3)
  • InferenceClientFactory (20-51)
  • get_rest_client (36-43)
  • get_grpc_client (31-34)
python/kserve/kserve/context.py (1)
  • get_predictor_config (39-45)
🪛 markdownlint-cli2 (0.17.2)
charts/llmisvc-crd/README.md

12-12: Dollar signs used before commands without showing output

(MD014, commands-show-output)

charts/llmisvc-resources/README.md

12-12: Dollar signs used before commands without showing output

(MD014, commands-show-output)

🪛 YAMLlint (1.37.1)
charts/kserve-resources/templates/clusterstoragecontainer.yaml

[warning] 11-11: wrong indentation: expected 4 but found 6

(indentation)


[warning] 12-12: wrong indentation: expected 4 but found 6

(indentation)


[error] 10-10: syntax error: expected the node content, but found '-'

(syntax)

charts/llmisvc-resources/templates/service.yaml

[warning] 9-9: wrong indentation: expected 2 but found 4

(indentation)


[error] 5-5: syntax error: expected , but found ''

(syntax)

charts/llmisvc-crd/Chart.yaml

[error] 4-4: trailing spaces

(trailing-spaces)

charts/llmisvc-resources/templates/deployment.yaml

[warning] 9-9: wrong indentation: expected 2 but found 4

(indentation)


[error] 8-8: syntax error: expected the node content, but found '-'

(syntax)

charts/llmisvc-resources/templates/serviceaccount.yaml

[warning] 9-9: wrong indentation: expected 2 but found 4

(indentation)


[error] 8-8: syntax error: expected the node content, but found '-'

(syntax)

.github/workflows/go.yml

[warning] 28-28: wrong indentation: expected 18 but found 15

(indentation)


[warning] 56-56: wrong indentation: expected 18 but found 16

(indentation)


[error] 62-62: syntax error: expected , but found ''

(syntax)


[warning] 74-74: wrong indentation: expected 20 but found 18

(indentation)


[error] 85-85: trailing spaces

(trailing-spaces)


[warning] 90-90: wrong indentation: expected 20 but found 18

(indentation)


[warning] 99-99: wrong indentation: expected 20 but found 18

(indentation)


[warning] 131-131: wrong indentation: expected 20 but found 18

(indentation)


[error] 134-134: trailing spaces

(trailing-spaces)


[warning] 139-139: wrong indentation: expected 20 but found 18

(indentation)


[error] 142-142: trailing spaces

(trailing-spaces)


[warning] 155-155: wrong indentation: expected 20 but found 18

(indentation)


[error] 161-161: trailing spaces

(trailing-spaces)


[error] 162-162: trailing spaces

(trailing-spaces)


[error] 163-163: trailing spaces

(trailing-spaces)


[error] 164-164: trailing spaces

(trailing-spaces)


[error] 165-165: trailing spaces

(trailing-spaces)


[error] 166-166: trailing spaces

(trailing-spaces)


[warning] 189-189: wrong indentation: expected 20 but found 18

(indentation)


[error] 191-191: no new line character at the end of file

(new-line-at-end-of-file)

🪛 Ruff (0.12.2)
python/kserve/kserve/__init__.py

18-18: .predictor_config.PredictorConfig imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

python/kserve/kserve/model.py

305-310: Use a single if statement instead of nested if statements

(SIM102)

python/storage/test/test_s3_storage.py

513-521: Use a single with statement with multiple contexts instead of nested with statements

Combine with statements

(SIM117)

python/kserve/kserve/storage/test/test_s3_storage.py

513-521: Use a single with statement with multiple contexts instead of nested with statements

Combine with statements

(SIM117)

🪛 actionlint (1.7.7)
.github/workflows/go.yml

16-16: could not parse as YAML: yaml: line 16: did not find expected key

(syntax-check)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: build (3.10)
  • GitHub Check: build (3.12)
  • GitHub Check: build (3.11)
  • GitHub Check: build (3.9)
  • GitHub Check: test
  • GitHub Check: precommit-check
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
🔇 Additional comments (90)
charts/kserve-resources/README.md (1)

205-208: Docs update looks good – resources keys are now discoverable

The new rows correctly expose the four kserve.storage.resources.* keys with their defaults. No issues spotted.

python/kserve/kserve/__init__.py (1)

18-18: LGTM! The import correctly exposes PredictorConfig in the public API.

The addition of PredictorConfig import aligns with the refactoring that extracted this class into its own module. The static analysis warning about unused import is a false positive - this is the expected pattern for __init__.py files to expose classes in the package's public API.

python/aiffairness/aifserver/model.py (1)

15-24: LGTM! Changes align with the predictor configuration refactoring.

The import reordering and removal of predictor configuration parameters are consistent with the broader architectural shift to context-based predictor configuration management. This simplifies the model initialization by removing direct dependency on PredictorConfig.

python/artexplainer/artserver/model.py (1)

38-38: LGTM! Constructor simplification aligns with the refactoring.

The removal of the predictor_config parameter and simplified superclass initialization is consistent with the broader architectural shift to context-based predictor configuration management across KServe model components.

python/huggingfaceserver/tests/test_model.py (1)

18-18: LGTM! Test cleanup aligns with the architectural refactoring.

The removal of imports and the associated test_bert_predictor_host test function is appropriate given the shift to context-based predictor configuration management. The test would have become obsolete under the new architecture where PredictorConfig is no longer passed directly to model constructors.

python/custom_tokenizer/transformer.py (2)

34-35: LGTM! Constructor simplification aligns with the refactoring.

The removal of the predictor_config parameter and simplified constructor signature is consistent with the architectural shift to context-based predictor configuration management across KServe components.


120-120: LGTM! Model instantiation properly simplified.

The updated model instantiation correctly removes the PredictorConfig creation and parameter passing, aligning with the new context-based configuration approach.

python/kserve/kserve/model_server.py (3)

27-27: LGTM: Correct import for context management.

The import of the context module enables the new context-based configuration pattern.


37-37: LGTM: Correct import path for PredictorConfig.

The import has been correctly updated to reference the new dedicated predictor_config module.


267-270: LGTM: Proper implementation of context-based configuration.

The predictor configuration is now correctly set in the global context before creating the DataPlane instance. This enables components to retrieve the configuration dynamically without explicit parameter passing.

python/custom_transformer/model_grpc.py (2)

50-50: LGTM: Constructor correctly simplified.

The removal of the predictor_config parameter aligns with the new context-based configuration approach where the configuration is retrieved dynamically from the global context.


78-78: LGTM: Model instantiation properly simplified.

The model instantiation now correctly passes only the required model name, following the new pattern where predictor configuration is managed through context.

python/kserve/kserve/context.py (3)

23-25: LGTM: Proper context variable initialization.

The use of contextvars.ContextVar provides thread-safe and async-friendly context management, with appropriate default value of None.


28-36: LGTM: Well-documented setter function.

The set_predictor_config function is properly implemented with clear documentation about when it should be called (during ModelServer initialization).


39-45: LGTM: Clean getter implementation.

The get_predictor_config function correctly retrieves the configuration from context and properly handles the case when no configuration is set (returns None).

python/kserve/test/test_dataplane.py (3)

38-41: LGTM: Correct import updates for context-based approach.

The imports have been properly updated to support the new context-based configuration pattern, importing from the new predictor_config module and adding the context module.


467-476: LGTM: Proper test pattern for context-based configuration.

The test correctly follows the new pattern: create PredictorConfig, set it in the global context, then create DataPlane. This ensures the configuration is available when DataPlane needs it.


484-494: LGTM: Consistent test implementation for gRPC scenario.

The test properly implements the context-based configuration pattern for gRPC with SSL, maintaining consistency with other test methods.

python/huggingfaceserver/huggingfaceserver/encoder_model.py (5)

66-66: LGTM: Correct context import addition.

The import of the context module enables dynamic retrieval of predictor configuration throughout the model lifecycle.


106-106: LGTM: Constructor properly simplified.

The removal of the predictor_config parameter is correct, as the configuration will now be retrieved dynamically from context when needed.


185-187: LGTM: Proper context-based configuration retrieval with null checking.

The dynamic retrieval of predictor_config from context is correctly implemented with appropriate null checking before accessing the predictor_host property.


223-224: LGTM: Consistent context usage in preprocess method.

The preprocess method correctly retrieves predictor configuration from context and performs proper null checking before accessing properties.


273-274: LGTM: Proper context usage in predict method.

The predict method consistently follows the pattern of retrieving predictor configuration from context with appropriate null checking.

python/custom_transformer/model.py (3)

33-33: LGTM! Import change aligns with the refactoring.

The import correctly changes from PredictorConfig to PredictorProtocol, which is still needed for protocol comparison in the preprocess and postprocess methods.


63-64: Constructor correctly simplified.

The removal of predictor_config parameter aligns with the new context-based configuration approach. The addition of return_response_headers=True is appropriate for transformers that may need to access response headers.


127-127: LGTM! Model instantiation correctly simplified.

The instantiation now only passes the model name, consistent with the simplified constructor and context-based configuration approach.

python/kserve/kserve/predictor_config.py (1)

20-107: Well-designed configuration class implementation.

The PredictorConfig class is properly implemented with:

  • Good encapsulation using private attributes and public properties
  • Sensible defaults for all configuration parameters
  • Backward compatibility through alias properties
  • Correct URL construction in predictor_base_url based on SSL flag

The class effectively centralizes predictor configuration management.

python/kserve/kserve/protocol/dataplane.py (4)

32-33: LGTM! Imports correctly updated.

The imports are properly adjusted to support the context-based predictor configuration approach.

Also applies to: 37-37


52-52: Constructor correctly simplified.

The removal of predictor_config parameter aligns with the new context-based configuration management approach.


66-70: Property correctly implements context-based retrieval.

The predictor_config property properly retrieves the configuration from the shared context, with appropriate typing indicating it may be None.


74-86: Excellent error handling in client properties.

Both rest_client and grpc_client properties properly handle the case where predictor_config is None by raising RuntimeError with clear, descriptive messages. This prevents confusing errors downstream and makes debugging easier.

Also applies to: 91-101

python/kserve/kserve/model.py (4)

23-24: LGTM! Imports correctly added.

The imports for context module and PredictorConfig are properly added to support the new architecture.


167-171: Property correctly implements context-based retrieval.

The predictor_config property is well-implemented with clear documentation that it may return None.


262-276: Excellent error handling in HTTP client property.

The property correctly handles the None case with a clear error message and maintains the existing check for predictor_host.


421-426: Consistent error handling in predict method.

The method correctly checks for None predictor_config and raises NotImplementedError with a clear message, maintaining consistency with the existing error handling pattern.

python/kserve/test/test_server.py (3)

53-57: LGTM! Import reorganization improves code organization.

The imports are now properly grouped and the new imports (PredictorConfig, kserve.context) are essential for the context-based configuration management being tested.


381-408: Excellent test coverage for protocol-specific validation.

The extended test properly validates the model's behavior under different predictor protocols (REST_V1 and REST_V2) using the new context-based configuration approach. The test structure is clear and comprehensive.


1321-1414: Comprehensive test coverage for context-based PredictorConfig integration.

This new test class excellently validates:

  1. ModelServer initialization - Proper setting of PredictorConfig in global context
  2. DataPlane integration - Functional operations with context-based config access
  3. Backwards compatibility - Ensuring existing functionality remains intact

The tests are well-structured with appropriate assertions for all configuration attributes and functional verification through actual inference calls.

cmd/manager/main.go (1)

238-242: Gateway API alias update looks good

The switch to gwapiv1.Install aligns with the new project-wide alias and compiles against gateway-api v1 which exports Install. No further changes needed.

.golangci.yml (1)

219-221: Alias list updated correctly

Adding gwapiv1 keeps linter and source imports consistent with the refactor. ✅

pkg/credentials/credentials_suite_test.go (1)

40-40: LGTM! Excellent standardization of CRD path resolution.

The change from hardcoded relative path to pkgtest.ProjectRoot() improves test robustness by eliminating dependency on the current working directory. This aligns with the consistent pattern being applied across multiple test suites in this PR.

python/storage/kserve_storage/kserve_storage.py (1)

224-226: LGTM! Proper cross-platform path construction.

Replacing string concatenation with os.path.join for CA bundle path construction is the correct approach for cross-platform compatibility and follows Python best practices for file path handling.

pkg/webhook/admission/pod/suite_test.go (1)

42-42: LGTM! Consistent standardization pattern.

The change to use pkgtest.ProjectRoot() matches the standardization pattern applied across other test suites in this PR, improving test reliability and eliminating hardcoded relative paths.

pkg/controller/v1alpha1/localmodelnode/suite_test.go (1)

64-64: LGTM! Consistent CRD path standardization.

The change to use pkgtest.ProjectRoot() continues the consistent pattern of standardizing CRD path resolution across test suites, improving reliability and maintainability.

charts/llmisvc-resources/README.md.gotmpl (1)

1-17: LGTM! Well-structured Helm chart README template.

The template follows standard Helm chart documentation patterns with proper Go templating syntax. The OCI registry installation command and comprehensive sections (header, description, badges, installation, requirements, values) provide a solid foundation for the LLMInferenceService chart documentation.

pkg/controller/v1beta1/inferenceservice/suite_test.go (3)

40-40: LGTM: Gateway API import alias standardization

The change from gatewayapiv1 to gwapiv1 aligns with the coordinated effort to standardize Gateway API import aliases across the codebase.


73-73: LGTM: Improved CRD path resolution

Replacing the hardcoded relative path with pkgtest.ProjectRoot() makes the test setup more robust and maintainable. This dynamic approach eliminates fragile relative path constructions and is consistent with similar updates across other test suites.


99-99: LGTM: Consistent alias usage

The reference to gwapiv1.Install() correctly uses the updated import alias.

pkg/controller/v1alpha1/inferencegraph/suite_test.go (1)

67-67: LGTM: Consistent CRD path resolution improvement

This change aligns with the coordinated effort to replace hardcoded relative paths with dynamic project root resolution using pkgtest.ProjectRoot(). The approach improves test robustness and maintainability.

pkg/controller/v1alpha1/localmodel/suite_test.go (1)

59-59: LGTM: Consistent CRD path resolution improvement

The change to use pkgtest.ProjectRoot() for CRD directory path construction follows the established pattern across other test suites and improves test environment setup robustness.

pkg/controller/v1alpha1/llmisvc/controller_types.go (1)

19-23: LGTM: Well-designed configuration struct

The Config struct is cleanly implemented with:

  • Appropriate field naming and types
  • Proper JSON serialization tags with omitempty for optional configuration
  • Clear purpose for llmisvc controller configuration

This follows Go best practices and integrates well with the broader LLMInferenceService support being added.

pkg/controller/v1alpha1/trainedmodel/suite_test.go (1)

68-68: LGTM: Final consistent CRD path resolution improvement

This change completes the coordinated effort to standardize CRD path resolution across test suites using pkgtest.ProjectRoot(), ensuring consistent and robust test environment setup.

pkg/testing/envtest_setup.go (1)

54-56: LGTM – Gateway API added to test scheme

gwapiv1.Install correctly registers Gateway-API types with the global scheme; tests that rely on HTTPRoute objects will now serialise without panicking.

pkg/apis/serving/v1alpha1/llm_inference_service_types.go (1)

24-26: Import-alias switch is consistent with the codebase

The new alias gwapiv1 matches the convention used elsewhere (e.g., controllers and tests). No further action required.

pkg/controller/v1beta1/inferenceservice/controller.go (3)

51-51: LGTM: Import alias standardization.

The Gateway API import alias change from gatewayapiv1 to gwapiv1 aligns with the broader codebase standardization mentioned in the AI summary.


316-320: LGTM: Improved HTTPRoute reconciler error handling.

The enhanced handling properly captures both reconciliation result and error, with appropriate early returns for requeue scenarios. This is a solid improvement over the previous implementation.


584-590: LGTM: Consistent Gateway API alias usage.

The SetupWithManager method correctly uses the standardized gwapiv1 alias for Gateway API CRD availability checks and ownership references.

llmisvc-controller.Dockerfile (2)

17-22: LGTM: Comprehensive license compliance checks.

The license checking implementation is thorough, including forbidden license detection and third-party license preservation. This ensures compliance with open source requirements.


25-28: LGTM: Security-focused final image.

Using distroless static nonroot image is a security best practice that minimizes attack surface while maintaining functionality.

pkg/testing/utils.go (1)

25-53: LGTM: Well-implemented project root detection utility.

The function correctly traverses the directory hierarchy to find the project root and appropriately uses panic for fatal configuration errors in test environments. This utility will improve test robustness by replacing hardcoded relative paths.

charts/llmisvc-crd/Chart.yaml (1)

1-22: LGTM: Well-structured Helm chart metadata.

The chart definition follows standard Helm practices with appropriate keywords, maintainer information, and metadata for the LLMInferenceService CRDs.

charts/llmisvc-resources/templates/service.yaml (1)

1-33: LGTM: Well-structured Kubernetes Service template.

The service template follows standard Kubernetes and Helm practices with proper templating, labeling, and configuration. The static analysis hints appear to be false positives due to YAMLlint not understanding Helm template syntax.

Makefile (1)

129-136: LGTM! Proper integration of llmisvc component.

The addition follows the established pattern for generating RBAC and CRD manifests for new components, correctly integrating the llmisvc controller into the build process.

go.mod (1)

67-173: LGTM! Appropriate indirect dependencies for new functionality.

The added dependencies support CEL expression evaluation, semantic versioning, retry logic, OpenTelemetry observability, and enhanced Kubernetes networking capabilities. These align well with the new LLMInferenceService controller features.

charts/llmisvc-resources/Chart.yaml (1)

1-24: LGTM! Well-structured Helm chart metadata.

The Chart.yaml follows Helm best practices with proper versioning, descriptive keywords, maintainer information, and appropriate categorization for AI/Machine Learning use cases.

charts/llmisvc-resources/README.md (1)

1-55: LGTM! Comprehensive Helm chart documentation.

The README provides clear installation instructions and thorough documentation of all configurable values. The command example in line 12 correctly shows the Helm install syntax without needing to show output.

python/kserve/kserve/storage/storage.py (1)

214-233: LGTM! Improved CA bundle handling logic.

The refactoring enhances code readability and maintainability by:

  • Establishing clear precedence (AWS_CA_BUNDLE over ConfigMap)
  • Using os.path.join for cross-platform path construction
  • Adding descriptive variable names (ca_bundle_set)
  • Simplifying the conditional flow

The logic correctly prioritizes environment-specific CA bundle settings while maintaining backward compatibility.

charts/llmisvc-resources/values.schema.json (1)

1-99: LGTM! Well-structured JSON schema.

The schema properly defines validation rules for the Helm chart values with appropriate types, constraints, and descriptions.

charts/llmisvc-resources/templates/deployment.yaml (1)

7-11: Fix YAML syntax errors in the deployment template.

The template has critical syntax issues:

  • Line 8: Remove the stray dash character
  • Line 9: Fix indentation (should be 2 spaces, not 4)

Apply this diff to fix the syntax errors:

   labels:
-    {{- include "llm-isvc-resources.labels" . | nindent 4 }}
+    {{- include "llm-isvc-resources.labels" . | nindent 4 }}
     app.kubernetes.io/component: controller
⛔ Skipped due to learnings
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:46-65
Timestamp: 2025-07-10T14:58:54.997Z
Learning: The file `test/overlays/llm-istio-experimental/istiod.yaml` is considered external and irrelevant for review in this repository.
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2025-2049
Timestamp: 2025-07-10T14:58:43.762Z
Learning: The file `test/overlays/llm-istio-experimental/istiod.yaml` is considered external and irrelevant for review in this repository.
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istio_base.yaml:6854-6869
Timestamp: 2025-07-10T14:58:14.161Z
Learning: The file `test/overlays/llm-istio-experimental/istio_base.yaml` is considered external and irrelevant for review in this repository.
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2559-2566
Timestamp: 2025-07-10T14:58:38.489Z
Learning: In the kserve repository, files in `test/overlays/llm-istio-experimental/` directory are considered external files by the maintainers and should not be flagged for YAML formatting issues or other code quality concerns.
charts/llmisvc-resources/values.yaml (1)

1-200: LGTM! Comprehensive and secure default values.

The values file provides well-structured defaults with:

  • Proper security contexts (non-root, read-only filesystem)
  • Conservative resource allocation
  • Appropriate health check configurations
  • Good use of YAML anchors for version management
pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler.go (1)

777-856: Well-structured refactoring of HTTPRoute status reconciliation.

The new reconcileHTTPRouteStatus method effectively centralizes readiness checks for all HTTPRoute components, improving code organization and maintainability. The logic properly handles missing routes and not-ready states with appropriate requeue behavior.

pkg/controller/v1alpha1/llmisvc/config_merge.go (2)

61-76: Well-structured merge function

The implementation correctly handles edge cases and maintains clear precedence order through sequential merging.


78-120: Excellent implementation of Kubernetes-aware merging

The use of strategic merge patch with the zero-value comparison technique is a sophisticated approach that correctly handles Kubernetes object merging semantics. This prevents unintended overwrites from zero-valued fields.

cmd/llmisvc/main.go (3)

44-48: Proper scheme initialization

Good use of utilruntime.Must for init-time scheme registration, which is the standard pattern for controller initialization.


50-70: Security-conscious default configuration

Excellent defaults with metrics secured by default and HTTP/2 disabled for security. The struct is well-organized with clear field names.


137-143: Incorrect Review Comment: LLMISVCReconciler has no Config field

The LLMISVCReconciler struct in pkg/controller/v1alpha1/llmisvc/controller.go is defined as:

type LLMISVCReconciler struct {
    client.Client
    Scheme *runtime.Scheme
}

There is no Config field to initialize. Please disregard the suggestion to add Config to its setup.

Likely an incorrect or invalid review comment.

pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/httproute_reconciler_test.go (4)

45-69: Well-designed test interceptor pattern

The httpRouteClientInterceptor is a clean implementation for simulating specific error conditions in tests. Good documentation explaining its purpose.


39-39: Consistent Gateway API import alias

Good standardization of the Gateway API import alias to gwapiv1, improving codebase consistency.


362-446: Comprehensive test coverage for HTTPRoute readiness checks

Excellent table-driven tests covering all possible HTTPRoute status conditions including edge cases like missing parent status, multiple parents, and various failure reasons.


2466-3258: Thorough reconciler tests with proper requeue validation

The new test cases effectively validate the reconciler's behavior for various HTTPRoute status conditions, properly asserting both the IngressReady condition and the requeue behavior. Particularly good coverage of the edge case where HTTPRoutes are created but then not found (lines 3206-3258).

python/kserve/kserve/storage/test/test_s3_storage.py (6)

383-418: LGTM!

The test correctly verifies that AWS_CA_BUNDLE can be used independently without ConfigMap environment variables. Good use of try/finally for cleanup.


420-451: LGTM!

The test properly validates the ConfigMap CA bundle scenario when AWS_CA_BUNDLE is not set. Good use of tempfile.TemporaryDirectory() context manager for automatic cleanup.


453-500: LGTM!

The test correctly validates that AWS_CA_BUNDLE takes precedence over ConfigMap CA bundle when both are set. Good approach using different certificate contents to verify which one is actually used.


525-549: LGTM!

The test correctly validates that a RuntimeError is raised when the ConfigMap CA bundle file doesn't exist at the expected path.


551-574: LGTM!

The test correctly validates that CA bundle file existence is not checked when S3_VERIFY_SSL is set to false, and that boto3.resource is called with verify=False.


576-607: LGTM!

The test correctly validates that an empty AWS_CA_BUNDLE falls back to using the ConfigMap CA bundle. This is an important edge case to test.

python/storage/test/test_s3_storage.py (1)

383-607: Test content duplicates python/kserve/kserve/storage/test/test_s3_storage.py

This file contains identical test functions as the previous file, with only the import statement differing (kserve_storage vs kserve.storage). Consider whether this duplication is intentional for testing different package distributions.

pkg/controller/v1alpha1/llmisvc/config_merge_test.go (3)

35-1095: Excellent test coverage for MergeSpecs!

The test cases comprehensively cover various merge scenarios including:

  • Deep merging of nested structures (PodSpec, containers, resources)
  • Handling of nil pointers vs zero values
  • Merge behavior for slices (replacement vs merging by key)
  • Complex nested structure merging
  • Edge cases with empty structures

The table-driven test approach with clear test names makes it easy to understand what each test validates.


1097-1516: Comprehensive test coverage for template variable replacement!

The test cases thoroughly cover:

  • Basic variable substitution (.Name, .Namespace, .Spec.Model.Name)
  • Custom template functions (ChildName)
  • Complex nested field access
  • Multiple variables in single string
  • Error cases (invalid syntax, non-existent fields)
  • Conditional logic with nil pointer handling

Good attention to edge cases and error scenarios.


1518-1524: LGTM!

Appropriate test helper for creating URLs in test cases where parse errors indicate test setup issues.

Comment thread .github/workflows/go.yml
Comment thread .github/workflows/go.yml
Comment thread charts/llmisvc-resources/templates/_helpers.tpl
Comment thread charts/llmisvc-resources/templates/serviceaccount.yaml
Comment thread llmisvc-controller.Dockerfile
Comment thread pkg/controller/v1alpha1/llmisvc/config_merge.go
Comment thread python/custom_transformer/model.py
@mholder6 mholder6 added the tide/merge-method-merge Change lgtm to use a merge commit label Jul 31, 2025
Copy link
Copy Markdown
Member

@spolti spolti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Jul 31, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brettmthompson, mholder6, spolti

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [brettmthompson,mholder6,spolti]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot Bot merged commit ec9b68c into opendatahub-io:master Jul 31, 2025
32 of 37 checks passed
@github-project-automation github-project-automation Bot moved this from New/Backlog to Done in ODH Model Serving Planning Jul 31, 2025
@mholder6 mholder6 deleted the 250731_sync_upstream_batch1 branch August 6, 2025 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm tide/merge-method-merge Change lgtm to use a merge commit

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

9 participants