[WIP] test(e2e): introduces basic e2e test for a sample single node model by bartoszmajsak · Pull Request #736 · opendatahub-io/kserve

bartoszmajsak · 2025-07-11T15:01:12Z

Important

First commit, fixing Python packages can be upstreamed separately.

How to run

Follow DEV.md to set up KinD, create Gateway and Destination Rule
resources.

Set up python venv

python -m venv venv
source venv/bin/activate
cd python/kserve
pip install -e .
pip install pytest pytest-asyncio requests
cd -

Execute the test

kind get kubeconfig --name kserve-llm-d > kubeconfig.e2e

KUBECONFIG=kubeconfig.e2e pytest test/e2e/llmisvc/test_llm_inference_service.py::test_llm_inference_service_facebook_opt -v

TODO

fix wait/watch
improve assertions (it's now checking 200 OK only)
automate setup (gateway and destination rule)
include as github action
expand test scenarios

Summary by CodeRabbit

New Features
- Added support for new LLM Inference Service model classes, making them available in the public API.
- Introduced an end-to-end test for LLM Inference Service, verifying deployment, readiness, and inference response using the Facebook OPT-125M model.
Tests
- Implemented comprehensive integration tests for LLM Inference Service, including resource creation, status monitoring, endpoint invocation, and cleanup.
- Added a new pytest marker for LLM Inference Service end-to-end tests.

coderabbitai · 2025-07-11T15:01:19Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

"""

Walkthrough

The changes introduce new LLM inference service model imports to the kserve package's public API, add an end-to-end test for KServe's LLMInferenceService using the Facebook OPT-125M model, and register a new pytest marker for these tests. No existing logic or control flow is altered.

Changes

File(s)	Change Summary
python/kserve/kserve/init.py	Added imports for four v1alpha1 LLMInferenceService-related model classes to the package namespace.
test/e2e/llmisvc/test_llm_inference_service.py	Added comprehensive async end-to-end test and helper functions for LLMInferenceService deployment, readiness, and inference validation.
test/e2e/pytest.ini	Added new pytest marker `llminferenceservice` for LLM inference service controller e2e tests.

Sequence Diagram(s)

sequenceDiagram
    participant Tester
    participant KServeClient
    participant KubernetesAPI
    participant LLMInferenceService

    Tester->>KServeClient: Create LLMInferenceService CR
    KServeClient->>KubernetesAPI: POST CR to cluster
    Tester->>KServeClient: Wait for readiness (watch status)
    KServeClient->>KubernetesAPI: GET/Watch CR status
    KubernetesAPI-->>KServeClient: Status updates
    KServeClient-->>Tester: Ready status
    Tester->>LLMInferenceService: Send inference HTTP request
    LLMInferenceService-->>Tester: Return inference result
    Tester->>KServeClient: Delete LLMInferenceService CR
    KServeClient->>KubernetesAPI: DELETE CR

Poem

In the garden of code, a new seed’s been sown,
LLM services now stand, fully grown.
With tests that await the models to rise,
And markers to help them organize.
🐇 Hopping with joy, I watch them deploy—
Inferences flourish, oh what a ploy!

"""

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

openshift-ci · 2025-07-11T15:01:20Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bartoszmajsak
Once this PR has been reviewed and has the lgtm label, please assign terrytangyuan for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

bartoszmajsak · 2025-07-11T17:41:38Z

@coderabbitai review

coderabbitai · 2025-07-11T17:45:38Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (4)

python/kserve/kserve/__init__.py (1)

52-55: LGTM! New LLMInferenceService model imports follow the existing pattern.

The addition of these v1alpha1 LLMInferenceService model classes to the package's public API is consistent with how other models are exposed.

Consider adding an __all__ list to this file in a future PR to explicitly define the public API and improve documentation/autocomplete support.

test/e2e/llmisvc/test_llm_inference_service.py (3)

90-106: Consider enhancing test assertions.

The test currently only validates HTTP 200 status and presence of "choices" field. As noted in the PR objectives TODO, more thorough assertions would improve test coverage.

Consider validating additional response fields such as:

Response structure matches expected completion format

Model name in response matches requested model

Generated text content is not empty

Response time is within acceptable limits

Would you like me to generate enhanced assertions for the test?

115-154: Consider moving LLMInferenceService operations to KServeClient.

The TODO comment correctly identifies that these helper functions would be better placed in the KServeClient class for reusability across tests and applications.

Moving these functions to KServeClient would:

Provide a consistent API for all KServe resource types

Enable reuse in other tests and client applications

Follow the existing pattern for InferenceService operations

Would you like me to create an issue to track this architectural improvement?

36-101: Acknowledge known limitations and future improvements.

Based on the PR objectives, this test has known limitations including:

Flaky wait conditions (currently being addressed)

Manual gateway/destination rule setup required

Basic assertions (HTTP 200 only)

The implementation provides a solid foundation for e2e testing of LLMInferenceService.

For the TODO items mentioned in the PR description (automating gateway setup, enhancing assertions, GitHub action integration), would you like me to help create tracking issues or provide implementation suggestions?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 947e449 and 8694f08.

📒 Files selected for processing (3)

python/kserve/kserve/__init__.py (1 hunks)
test/e2e/llmisvc/test_llm_inference_service.py (1 hunks)
test/e2e/pytest.ini (1 hunks)

🧰 Additional context used

🧠 Learnings (3)

📓 Common learnings

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:46-65
Timestamp: 2025-07-10T14:58:54.997Z
Learning: The file `test/overlays/llm-istio-experimental/istiod.yaml` is considered external and irrelevant for review in this repository.

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2025-2049
Timestamp: 2025-07-10T14:58:43.762Z
Learning: The file `test/overlays/llm-istio-experimental/istiod.yaml` is considered external and irrelevant for review in this repository.

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istio_base.yaml:6854-6869
Timestamp: 2025-07-10T14:58:14.161Z
Learning: The file `test/overlays/llm-istio-experimental/istio_base.yaml` is considered external and irrelevant for review in this repository.

Learnt from: hdefazio
PR: opendatahub-io/kserve#658
File: pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go:1873-1886
Timestamp: 2025-06-12T22:10:15.375Z
Learning: If a review comment is raised on code that was not introduced or changed by the current PR, and the user points this out, acknowledge it and clarify that no action is required from the PR author.

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2559-2566
Timestamp: 2025-07-10T14:58:38.489Z
Learning: In the kserve repository, files in `test/overlays/llm-istio-experimental/` directory are considered external files by the maintainers and should not be flagged for YAML formatting issues or other code quality concerns.

test/e2e/pytest.ini (3)

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2559-2566
Timestamp: 2025-07-10T14:58:38.489Z
Learning: In the kserve repository, files in `test/overlays/llm-istio-experimental/` directory are considered external files by the maintainers and should not be flagged for YAML formatting issues or other code quality concerns.

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2025-2049
Timestamp: 2025-07-10T14:58:43.762Z
Learning: The file `test/overlays/llm-istio-experimental/istiod.yaml` is considered external and irrelevant for review in this repository.

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:46-65
Timestamp: 2025-07-10T14:58:54.997Z
Learning: The file `test/overlays/llm-istio-experimental/istiod.yaml` is considered external and irrelevant for review in this repository.

test/e2e/llmisvc/test_llm_inference_service.py (1)

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2559-2566
Timestamp: 2025-07-10T14:58:38.489Z
Learning: In the kserve repository, files in `test/overlays/llm-istio-experimental/` directory are considered external files by the maintainers and should not be flagged for YAML formatting issues or other code quality concerns.

🧬 Code Graph Analysis (1)

test/e2e/llmisvc/test_llm_inference_service.py (4)

python/kserve/kserve/api/kserve_client.py (2)

KServeClient (31-834)

get (166-223)

python/kserve/kserve/models/v1alpha1_llm_inference_service.py (1)

V1alpha1LLMInferenceService (35-242)

python/kserve/kserve/utils/utils.py (1)

get_isvc_namespace (49-50)

python/kserve/kserve/exceptions.py (1)

ApiException (98-123)

🪛 Ruff (0.11.9)

python/kserve/kserve/__init__.py

52-52: .models.v1alpha1_llm_inference_service.V1alpha1LLMInferenceService imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

53-53: .models.v1alpha1_llm_inference_service_config.V1alpha1LLMInferenceServiceConfig imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

54-54: .models.v1alpha1_llm_inference_service_config_list.V1alpha1LLMInferenceServiceConfigList imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

55-55: .models.v1alpha1_llm_inference_service_list.V1alpha1LLMInferenceServiceList imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

test/e2e/llmisvc/test_llm_inference_service.py

15-15: asyncio imported but unused

Remove unused import: asyncio

(F401)

16-16: json imported but unused

Remove unused import: json

(F401)

21-21: kserve.constants imported but unused

Remove unused import: kserve.constants

(F401)

143-145: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

175-177: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

197-199: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

245-245: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🪛 Flake8 (7.2.0)

test/e2e/llmisvc/test_llm_inference_service.py

[error] 15-15: 'asyncio' imported but unused

(F401)

[error] 16-16: 'json' imported but unused

(F401)

[error] 21-21: 'kserve.constants' imported but unused

(F401)

[error] 312-312: visually indented line with same indent as next logical line

(E129)

🪛 GitHub Actions: Precommit Check

test/e2e/llmisvc/test_llm_inference_service.py

[error] 15-15: flake8: 'asyncio' imported but unused (F401)

[error] 16-16: flake8: 'json' imported but unused (F401)

[error] 21-21: flake8: 'kserve.constants' imported but unused (F401)

[error] 108-108: flake8: too many leading '#' for block comment (E266)

🔇 Additional comments (1)

test/e2e/pytest.ini (1)

20-20: LGTM! New pytest marker for LLM inference service tests.

The marker follows the existing naming convention and will enable selective execution of LLM inference service e2e tests.

coderabbitai · 2025-07-11T17:53:28Z

+        raise RuntimeError(
+            f"Exception when calling CustomObjectsApi->create_namespaced_custom_object for LLMInferenceService: {e}"
+        )


🛠️ Refactor suggestion

Improve exception handling by preserving the exception chain.

Use raise ... from e to maintain the exception chain for better debugging and error tracing.

Apply these diffs to fix exception handling:

In create_llmisvc:

except client.rest.ApiException as e: raise RuntimeError( f"Exception when calling CustomObjectsApi->create_namespaced_custom_object for LLMInferenceService: {e}" - ) + ) from e

In delete_llmisvc:

except client.rest.ApiException as e: raise RuntimeError( f"Exception when calling CustomObjectsApi->delete_namespaced_custom_object for LLMInferenceService: {e}" - ) + ) from e

In get_llmisvc:

except client.rest.ApiException as e: raise RuntimeError( f"Exception when calling CustomObjectsApi->get_namespaced_custom_object for LLMInferenceService: {e}" - ) + ) from e

In get_llm_service_url:

except Exception as e: - raise ValueError(f"Failed to get URL for LLM inference service {service_name}: {e}") + raise ValueError(f"Failed to get URL for LLM inference service {service_name}: {e}") from e

Also applies to: 175-177, 197-199, 245-245

🧰 Tools

🪛 Ruff (0.11.9)

143-145: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents

In test/e2e/llmisvc/test_llm_inference_service.py at lines 143-145 and similarly at lines 175-177, 197-199, and 245, improve exception handling by changing the raise statements to use "raise ... from e" syntax. This preserves the original exception context and stack trace. Locate each raise RuntimeError statement that includes the exception variable e and modify it to "raise RuntimeError(...) from e" to maintain the exception chain for better debugging.

coderabbitai · 2025-07-11T17:53:28Z

+async def wait_llm_isvc_ready(
+    kserve_client, 
+    name, 
+    namespace, 
+    timeout_seconds=600,
+    version=KSERVE_V1ALPHA1_VERSION
+):
+    """
+    Wait for LLM inference service to be ready by checking all conditions
+    :param kserve_client: KServe client instance
+    :param name: LLM inference service name
+    :param namespace: namespace
+    :param timeout_seconds: timeout seconds for waiting, default to 600s
+    :param version: api group version
+    :return: ready LLM inference service object
+    """
+    llm_isvc_watch(name=name, namespace=namespace, timeout_seconds=timeout_seconds)
+
+    llm_isvc = get_llmisvc(kserve_client, name, namespace, version)
+
+    return llm_isvc
+


🛠️ Refactor suggestion

Remove unnecessary async declaration or implement true async waiting.

The wait_llm_isvc_ready function is declared as async but doesn't perform any asynchronous operations. It calls the synchronous llm_isvc_watch function.

Either remove the async declaration:

-async def wait_llm_isvc_ready( +def wait_llm_isvc_ready( kserve_client, name, namespace, timeout_seconds=600, version=KSERVE_V1ALPHA1_VERSION ):

And update the test to not await it:

- await wait_llm_isvc_ready(kserve_client, service_name, KSERVE_TEST_NAMESPACE) + wait_llm_isvc_ready(kserve_client, service_name, KSERVE_TEST_NAMESPACE)

Or implement true async waiting using asyncio if needed for concurrent operations.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In test/e2e/llmisvc/test_llm_inference_service.py around lines 202 to 223, the function wait_llm_isvc_ready is declared async but only calls synchronous functions without awaiting any async operations. To fix this, either remove the async keyword from the function definition and update all calls to it to not use await, or refactor the function to perform actual asynchronous waiting using asyncio constructs if concurrent async behavior is required.

coderabbitai · 2025-07-11T17:53:28Z

+    finally:
+        try:
+            print(f"{service_url}/v1/completions")
+            delete_llmisvc(kserve_client, service_name, KSERVE_TEST_NAMESPACE)
+        except Exception as e:
+            print(f"Warning: Failed to cleanup service {service_name}: {e}")


🛠️ Refactor suggestion

Move debug print statement before cleanup.

The print statement is currently in the finally block after the service might be deleted, which could be confusing for debugging.

Apply this diff to improve the code structure:

assert response.status_code == 200, f"Expected 200 but got {response.status_code}: {response.text}" response_data = response.json() assert "choices" in response_data, "Response should contain 'choices' field" + print(f"Successfully tested endpoint: {completion_url}") + finally: try: - print(f"{service_url}/v1/completions") delete_llmisvc(kserve_client, service_name, KSERVE_TEST_NAMESPACE)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

finally:

try:

print(f"{service_url}/v1/completions")

delete_llmisvc(kserve_client, service_name, KSERVE_TEST_NAMESPACE)

except Exception as e:

print(f"Warning: Failed to cleanup service {service_name}: {e}")

assert response.status_code == 200, f"Expected 200 but got {response.status_code}: {response.text}"

response_data = response.json()

assert "choices" in response_data, "Response should contain 'choices' field"

print(f"Successfully tested endpoint: {completion_url}")

finally:

try:

delete_llmisvc(kserve_client, service_name, KSERVE_TEST_NAMESPACE)

except Exception as e:

print(f"Warning: Failed to cleanup service {service_name}: {e}")

🧰 Tools

🪛 GitHub Actions: Precommit Check

[error] 108-108: flake8: too many leading '#' for block comment (E266)

🤖 Prompt for AI Agents

In test/e2e/llmisvc/test_llm_inference_service.py around lines 108 to 113, move the debug print statement that outputs the service URL to before the call to delete_llmisvc. This ensures the URL is printed before the service deletion attempt, improving clarity during debugging. Keep the try-except structure to catch and log any exceptions during cleanup.

coderabbitai · 2025-07-11T17:53:29Z

+                if (ready == "True" and 
+                    workloads_ready == "True" and 
+                    router_ready == "True" and
+                    main_workload_ready == "True"):
+                    print("All conditions ready")


⚠️ Potential issue

Fix indentation issue in multi-line condition.

The condition continuation has incorrect indentation causing a flake8 E129 error.

Apply this diff to fix the indentation:

if (ready == "True" and workloads_ready == "True" and - router_ready == "True" and - main_workload_ready == "True"): + router_ready == "True" and + main_workload_ready == "True"): print("All conditions ready")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (ready == "True" and

workloads_ready == "True" and

router_ready == "True" and

main_workload_ready == "True"):

print("All conditions ready")

if (ready == "True" and

workloads_ready == "True" and

router_ready == "True" and

main_workload_ready == "True"):

print("All conditions ready")

🧰 Tools

🪛 Flake8 (7.2.0)

[error] 312-312: visually indented line with same indent as next logical line

(E129)

🤖 Prompt for AI Agents

In test/e2e/llmisvc/test_llm_inference_service.py around lines 309 to 313, the multi-line if condition has incorrect indentation causing a flake8 E129 error. Fix this by aligning the continued lines properly under the opening parenthesis, ensuring consistent indentation for each condition line to comply with style guidelines.

Jooho

Follow DEV.md to set up KinD, create Gateway and Destination Rule

Shouldn't the e2e test include creating this Gateway and Destination Rule?

Jooho · 2025-07-11T18:02:14Z

+            "template": {
+                "containers": [{
+                    "name": "main",
+                    "image": "quay.io/pierdipi/vllm-cpu:latest",


Would it be okay to use images from a personal repository for testing purposes?

Yes, it's wip. We should use official one for the real thing, but I think that very image was not published lately and the latest version was buggy, that's why we published newer one. FWIW that's not any different to what we have in sample and what we used in the demo.

But I agree - we should address this problem.

/cc @pierDipi

maybe a quick improvement for now is that we could push that same image to the opendatahub Quay org to avoid referencing mine but it's ok to stay on that personal one for the purpose of this PR (perhaps extract a constant for quick future edits)

bartoszmajsak · 2025-07-11T20:12:45Z

Follow DEV.md to set up KinD, create Gateway and Destination Rule

Shouldn't the e2e test include creating this Gateway and Destination Rule?

It's part of PRs TODOs :)

automate setup (gateway and destination rule)

bartoszmajsak · 2025-07-13T21:53:15Z

@coderabbitai review

coderabbitai · 2025-07-13T21:53:20Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (4)

test/e2e/llmisvc/test_llm_inference_service.py (4)

113-115: Fix exception handling by preserving the exception chain.

Use raise ... from e to maintain the exception chain for better debugging and error tracing.

138-140: Fix exception handling by preserving the exception chain.

Use raise ... from e to maintain the exception chain for better debugging and error tracing.

163-165: Fix exception handling by preserving the exception chain.

Use raise ... from e to maintain the exception chain for better debugging and error tracing.

258-260: Fix exception handling by preserving the exception chain.

Use raise ... from e to maintain the exception chain for better debugging and error tracing.

🧹 Nitpick comments (2)

test/e2e/llmisvc/test_llm_inference_service.py (2)
50-50: Consider using official container images for production tests.

The test uses a personal repository image (quay.io/pierdipi/vllm-cpu:latest). For production-ready tests, consider using official images to ensure consistency and reliability.

Do you want me to help identify official alternatives or create an issue to track this improvement?

213-213: Improve polling efficiency with exponential backoff.

The current polling uses a fixed 10-second interval which may be inefficient. Consider implementing exponential backoff for faster initial checks while avoiding overwhelming the service.

Apply this diff to improve the polling strategy:
-    wait_for(assert_model_responds, timeout=timeout_seconds, interval=10.0)
+    wait_for(assert_model_responds, timeout=timeout_seconds, interval=2.0)
And consider enhancing the wait_for function with exponential backoff:
def wait_for(assertion_fn, timeout: float = 5.0, initial_interval: float = 0.1, max_interval: float = 10.0):
    """
    Repeatedly calls assertion_fn() with exponential backoff until success or timeout.
    """
    deadline = time.time() + timeout
    interval = initial_interval
    
    while True:
        try:
            return assertion_fn()
        except AssertionError:
            if time.time() >= deadline:
                raise
            time.sleep(interval)
            interval = min(interval * 1.5, max_interval)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8694f08 and d5d7902.

📒 Files selected for processing (3)

python/kserve/kserve/__init__.py (1 hunks)
test/e2e/llmisvc/test_llm_inference_service.py (1 hunks)
test/e2e/pytest.ini (1 hunks)

✅ Files skipped from review due to trivial changes (1)

test/e2e/pytest.ini

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2025-2049
Timestamp: 2025-07-10T14:58:43.762Z
Learning: The file `test/overlays/llm-istio-experimental/istiod.yaml` is considered external and irrelevant for review in this repository.

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:46-65
Timestamp: 2025-07-10T14:58:54.997Z
Learning: The file `test/overlays/llm-istio-experimental/istiod.yaml` is considered external and irrelevant for review in this repository.

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istio_base.yaml:6854-6869
Timestamp: 2025-07-10T14:58:14.161Z
Learning: The file `test/overlays/llm-istio-experimental/istio_base.yaml` is considered external and irrelevant for review in this repository.

Learnt from: hdefazio
PR: opendatahub-io/kserve#658
File: pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go:1873-1886
Timestamp: 2025-06-12T22:10:15.375Z
Learning: If a review comment is raised on code that was not introduced or changed by the current PR, and the user points this out, acknowledge it and clarify that no action is required from the PR author.

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2559-2566
Timestamp: 2025-07-10T14:58:38.489Z
Learning: In the kserve repository, files in `test/overlays/llm-istio-experimental/` directory are considered external files by the maintainers and should not be flagged for YAML formatting issues or other code quality concerns.

test/e2e/llmisvc/test_llm_inference_service.py (1)

Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2559-2566
Timestamp: 2025-07-10T14:58:38.489Z
Learning: In the kserve repository, files in `test/overlays/llm-istio-experimental/` directory are considered external files by the maintainers and should not be flagged for YAML formatting issues or other code quality concerns.

🪛 Ruff (0.11.9)

test/e2e/llmisvc/test_llm_inference_service.py

113-115: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

138-140: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

163-165: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

191-191: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

204-204: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

258-260: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

python/kserve/kserve/__init__.py

52-52: .models.v1alpha1_llm_inference_service.V1alpha1LLMInferenceService imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

54-54: .models.v1alpha1_llm_inference_service_config.V1alpha1LLMInferenceServiceConfig imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

57-57: .models.v1alpha1_llm_inference_service_config_list.V1alpha1LLMInferenceServiceConfigList imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

59-59: .models.v1alpha1_llm_inference_service_list.V1alpha1LLMInferenceServiceList imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)

GitHub Check: precommit-check
GitHub Check: build (3.9)
GitHub Check: build (3.11)
GitHub Check: build (3.12)
GitHub Check: test
GitHub Check: build (3.10)
GitHub Check: Build

🔇 Additional comments (1)

python/kserve/kserve/__init__.py (1)

52-59: LGTM! New LLM inference service models properly exposed in public API.

The new imports correctly expose the LLM inference service model classes in the package's public API, following the established pattern. The static analysis warnings about unused imports are false positives - these imports are intended for external consumption.

coderabbitai · 2025-07-13T21:57:27Z

+@pytest.mark.llminferenceservice
+@pytest.mark.asyncio(scope="session")
+async def test_llm_inference_service_facebook_opt():


🛠️ Refactor suggestion

Remove unnecessary async marker or add async operations.

The test function is marked with @pytest.mark.asyncio but doesn't contain any await statements or async operations. This creates unnecessary overhead.

Apply this diff to remove the async marker:

@pytest.mark.llminferenceservice -@pytest.mark.asyncio(scope="session") -async def test_llm_inference_service_facebook_opt(): +def test_llm_inference_service_facebook_opt():

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@pytest.mark.llminferenceservice

@pytest.mark.asyncio(scope="session")

async def test_llm_inference_service_facebook_opt():

@pytest.mark.llminferenceservice

def test_llm_inference_service_facebook_opt():

🤖 Prompt for AI Agents

In test/e2e/llmisvc/test_llm_inference_service.py around lines 31 to 33, the test function is marked as async with @pytest.mark.asyncio but contains no await or async operations. Remove the @pytest.mark.asyncio decorator and the async keyword from the function definition to eliminate unnecessary async overhead.

…odel > [!IMPORTANT] > First commit, fixing Python packages can be upstreamed separately. ### How to run Follow `DEV.md` to set up KinD, create Gateway and Destination Rule resources. #### Set up python venv ```shell python -m venv venv source venv/bin/activate cd python/kserve pip install -e . pip install pytest pytest-asyncio tabulate requests cd - ``` #### Execute the test ```shell kind get kubeconfig --name kserve-llm-d > kubeconfig.e2e KUBECONFIG=kubeconfig.e2e pytest test/e2e/llmisvc/test_llm_inference_service.py::test_llm_inference_service_facebook_opt -v ``` > [!IMPORTANT] > It seems that wait conditions are a bit flaky. If you set everything up before starting the test (and comment out create/delete LLMISVC in the test code), the test passes. ### TODO - make watch funcs a bit less verbose :) - improve assertions (it's now checking `200 OK` only) - automate setup (gateway and destination rule) - include as github action - expand test scenarios Signed-off-by: Bartosz Majsak <bartosz.majsak@gmail.com>

pierDipi

This is great!

I'd love to see the test instructions in the PR body added to DEV doc for future quick reference.

Generally LGTM!

bartoszmajsak · 2025-07-24T17:02:55Z

Closes in favor of #774

fix(deps): adds missing LLMInferenceService types for python

07452a9

bartoszmajsak added the llm-d label Jul 11, 2025

github-project-automation Bot added this to ODH Model Serving Planning Jul 11, 2025

github-project-automation Bot moved this to New/Backlog in ODH Model Serving Planning Jul 11, 2025

openshift-ci Bot added the do-not-merge/work-in-progress label Jul 11, 2025

coderabbitai Bot reviewed Jul 11, 2025

View reviewed changes

Jooho reviewed Jul 11, 2025

View reviewed changes

bartoszmajsak force-pushed the llm/basic-e2e-test branch 2 times, most recently from 1ed9a72 to d5d7902 Compare July 13, 2025 21:49

openshift-ci Bot requested a review from pierDipi July 13, 2025 21:54

coderabbitai Bot reviewed Jul 13, 2025

View reviewed changes

bartoszmajsak force-pushed the llm/basic-e2e-test branch from d5d7902 to 2fa554a Compare July 13, 2025 22:03

pierDipi reviewed Jul 14, 2025

View reviewed changes

bartoszmajsak closed this Jul 24, 2025

github-project-automation Bot moved this from New/Backlog to Done in ODH Model Serving Planning Jul 24, 2025

bartoszmajsak mentioned this pull request Jul 24, 2025

test(e2e): introduces e2e test skeleton #774

Merged

Conversation

bartoszmajsak commented Jul 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

How to run

Set up python venv

Execute the test

TODO

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jul 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Sequence Diagram(s)

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

openshift-ci Bot commented Jul 11, 2025

Uh oh!

bartoszmajsak commented Jul 11, 2025

Uh oh!

coderabbitai Bot commented Jul 11, 2025

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 11, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jul 11, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 11, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 11, 2025

Choose a reason for hiding this comment

Uh oh!

Jooho left a comment

Choose a reason for hiding this comment

Uh oh!

Jooho Jul 11, 2025

Choose a reason for hiding this comment

Uh oh!

bartoszmajsak Jul 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bartoszmajsak Jul 13, 2025

Choose a reason for hiding this comment

Uh oh!

pierDipi Jul 14, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

bartoszmajsak commented Jul 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bartoszmajsak commented Jul 13, 2025

Uh oh!

coderabbitai Bot commented Jul 13, 2025

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 13, 2025

Choose a reason for hiding this comment

Uh oh!

pierDipi left a comment

Choose a reason for hiding this comment

Uh oh!

bartoszmajsak commented Jul 11, 2025 •

edited

Loading

coderabbitai Bot commented Jul 11, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

bartoszmajsak Jul 11, 2025 •

edited

Loading

bartoszmajsak commented Jul 11, 2025 •

edited

Loading