[WIP] test(e2e): introduces basic e2e test for a sample single node model#736
Important: Review skipped
Auto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI.
Walkthrough
The changes introduce new LLM inference service model imports to the
Changes
Sequence Diagram(s)
sequenceDiagram
participant Tester
participant KServeClient
participant KubernetesAPI
participant LLMInferenceService
Tester->>KServeClient: Create LLMInferenceService CR
KServeClient->>KubernetesAPI: POST CR to cluster
Tester->>KServeClient: Wait for readiness (watch status)
KServeClient->>KubernetesAPI: GET/Watch CR status
KubernetesAPI-->>KServeClient: Status updates
KServeClient-->>Tester: Ready status
Tester->>LLMInferenceService: Send inference HTTP request
LLMInferenceService-->>Tester: Return inference result
Tester->>KServeClient: Delete LLMInferenceService CR
KServeClient->>KubernetesAPI: DELETE CR
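The flow in the diagram can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `api` is any object exposing the Kubernetes `CustomObjectsApi` method names, the `GROUP` and `PLURAL` values are assumed, and `run_lifecycle`, `is_ready`, and `infer` are hypothetical names introduced only for this sketch.

```python
import time
from typing import Any, Callable, Dict

# Assumed identifiers for illustration only; the PR text does not spell them out.
GROUP = "serving.kserve.io"
VERSION = "v1alpha1"
PLURAL = "llminferenceservices"


def is_ready(resource: Dict[str, Any]) -> bool:
    """Return True when the resource reports a Ready condition with status "True"."""
    conditions = resource.get("status", {}).get("conditions", [])
    return any(c.get("type") == "Ready" and c.get("status") == "True" for c in conditions)


def run_lifecycle(api: Any, body: Dict[str, Any], namespace: str,
                  infer: Callable[[], Any], max_polls: int = 60,
                  poll_interval: float = 5.0) -> Any:
    """Create the CR, poll until ready, run inference, and always delete the CR."""
    name = body["metadata"]["name"]
    api.create_namespaced_custom_object(
        group=GROUP, version=VERSION, namespace=namespace, plural=PLURAL, body=body)
    try:
        for _ in range(max_polls):
            current = api.get_namespaced_custom_object(
                group=GROUP, version=VERSION, namespace=namespace, plural=PLURAL, name=name)
            if is_ready(current):
                # The service is ready, so send the inference request.
                return infer()
            time.sleep(poll_interval)
        raise TimeoutError(f"{name} did not become ready after {max_polls} polls")
    finally:
        # Cleanup runs whether inference succeeded, failed, or readiness timed out.
        api.delete_namespaced_custom_object(
            group=GROUP, version=VERSION, namespace=namespace, plural=PLURAL, name=name)
```

Passing the API object in as a parameter keeps the sketch runnable against a fake in unit tests; in a real cluster it would presumably be a `kubernetes.client.CustomObjectsApi` instance.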
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: bartoszmajsak
@coderabbitai review
✅ Actions performed
Review triggered.
Actionable comments posted: 6
🧹 Nitpick comments (4)
python/kserve/kserve/__init__.py (1)
52-55: LGTM! New LLMInferenceService model imports follow the existing pattern.
The addition of these v1alpha1 LLMInferenceService model classes to the package's public API is consistent with how other models are exposed.
Consider adding an `__all__` list to this file in a future PR to explicitly define the public API and improve documentation/autocomplete support.
test/e2e/llmisvc/test_llm_inference_service.py (3)
90-106: Consider enhancing test assertions.
The test currently only validates the HTTP 200 status and the presence of the "choices" field. As noted in the PR objectives TODO, more thorough assertions would improve test coverage.
Consider validating additional response fields such as:
- Response structure matches expected completion format
- Model name in response matches requested model
- Generated text content is not empty
- Response time is within acceptable limits
Would you like me to generate enhanced assertions for the test?
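The four checks listed above could look roughly like this. It is a sketch, assuming an OpenAI-style completion payload with a top-level `model` field and `choices[].text`; the helper name `assert_completion_response` and the 30-second default limit are hypothetical and not taken from the PR.

```python
from typing import Any, Dict


def assert_completion_response(data: Dict[str, Any], expected_model: str,
                               elapsed_seconds: float, max_seconds: float = 30.0) -> None:
    """Check structure, model name, generated text, and latency of a completion response."""
    # Structure: "choices" must be a non-empty list.
    assert isinstance(data.get("choices"), list) and data["choices"], \
        "Response should contain a non-empty 'choices' list"
    # Model name must match the one that was requested.
    assert data.get("model") == expected_model, \
        f"Expected model {expected_model!r} but got {data.get('model')!r}"
    # Every choice must carry some generated text.
    for index, choice in enumerate(data["choices"]):
        text = choice.get("text", "")
        assert isinstance(text, str) and text.strip(), f"Choice {index} has empty text"
    # Latency must stay within the configured limit.
    assert elapsed_seconds <= max_seconds, \
        f"Response took {elapsed_seconds:.1f}s, limit is {max_seconds:.1f}s"
```

In the test, `elapsed_seconds` could be measured around the HTTP call with `time.monotonic()`, and `data` would be the parsed `response.json()`.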
115-154: Consider moving LLMInferenceService operations to KServeClient.
The TODO comment correctly identifies that these helper functions would be better placed in the KServeClient class for reusability across tests and applications.
Moving these functions to KServeClient would:
- Provide a consistent API for all KServe resource types
- Enable reuse in other tests and client applications
- Follow the existing pattern for InferenceService operations
Would you like me to create an issue to track this architectural improvement?
36-101: Acknowledge known limitations and future improvements.
Based on the PR objectives, this test has known limitations including:
- Flaky wait conditions (currently being addressed)
- Manual gateway/destination rule setup required
- Basic assertions (HTTP 200 only)
The implementation provides a solid foundation for e2e testing of LLMInferenceService.
For the TODO items mentioned in the PR description (automating gateway setup, enhancing assertions, GitHub action integration), would you like me to help create tracking issues or provide implementation suggestions?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- python/kserve/kserve/__init__.py (1 hunks)
- test/e2e/llmisvc/test_llm_inference_service.py (1 hunks)
- test/e2e/pytest.ini (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:46-65
Timestamp: 2025-07-10T14:58:54.997Z
Learning: The file `test/overlays/llm-istio-experimental/istiod.yaml` is considered external and irrelevant for review in this repository.
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2025-2049
Timestamp: 2025-07-10T14:58:43.762Z
Learning: The file `test/overlays/llm-istio-experimental/istiod.yaml` is considered external and irrelevant for review in this repository.
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istio_base.yaml:6854-6869
Timestamp: 2025-07-10T14:58:14.161Z
Learning: The file `test/overlays/llm-istio-experimental/istio_base.yaml` is considered external and irrelevant for review in this repository.
Learnt from: hdefazio
PR: opendatahub-io/kserve#658
File: pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go:1873-1886
Timestamp: 2025-06-12T22:10:15.375Z
Learning: If a review comment is raised on code that was not introduced or changed by the current PR, and the user points this out, acknowledge it and clarify that no action is required from the PR author.
Learnt from: bartoszmajsak
PR: opendatahub-io/kserve#729
File: test/overlays/llm-istio-experimental/istiod.yaml:2559-2566
Timestamp: 2025-07-10T14:58:38.489Z
Learning: In the kserve repository, files in `test/overlays/llm-istio-experimental/` directory are considered external files by the maintainers and should not be flagged for YAML formatting issues or other code quality concerns.
🧬 Code Graph Analysis (1)
test/e2e/llmisvc/test_llm_inference_service.py (4)
python/kserve/kserve/api/kserve_client.py (2)
KServeClient(31-834)
get(166-223)
python/kserve/kserve/models/v1alpha1_llm_inference_service.py (1)
V1alpha1LLMInferenceService(35-242)
python/kserve/kserve/utils/utils.py (1)
get_isvc_namespace(49-50)
python/kserve/kserve/exceptions.py (1)
ApiException(98-123)
🪛 Ruff (0.11.9)
python/kserve/kserve/__init__.py
52-52: .models.v1alpha1_llm_inference_service.V1alpha1LLMInferenceService imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
53-53: .models.v1alpha1_llm_inference_service_config.V1alpha1LLMInferenceServiceConfig imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
54-54: .models.v1alpha1_llm_inference_service_config_list.V1alpha1LLMInferenceServiceConfigList imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
55-55: .models.v1alpha1_llm_inference_service_list.V1alpha1LLMInferenceServiceList imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
test/e2e/llmisvc/test_llm_inference_service.py
15-15: asyncio imported but unused
Remove unused import: asyncio
(F401)
16-16: json imported but unused
Remove unused import: json
(F401)
21-21: kserve.constants imported but unused
Remove unused import: kserve.constants
(F401)
143-145: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
175-177: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
197-199: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
245-245: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🪛 Flake8 (7.2.0)
test/e2e/llmisvc/test_llm_inference_service.py
[error] 15-15: 'asyncio' imported but unused
(F401)
[error] 16-16: 'json' imported but unused
(F401)
[error] 21-21: 'kserve.constants' imported but unused
(F401)
[error] 312-312: visually indented line with same indent as next logical line
(E129)
🪛 GitHub Actions: Precommit Check
test/e2e/llmisvc/test_llm_inference_service.py
[error] 15-15: flake8: 'asyncio' imported but unused (F401)
[error] 16-16: flake8: 'json' imported but unused (F401)
[error] 21-21: flake8: 'kserve.constants' imported but unused (F401)
[error] 108-108: flake8: too many leading '#' for block comment (E266)
🔇 Additional comments (1)
test/e2e/pytest.ini (1)
20-20: LGTM! New pytest marker for LLM inference service tests.
The marker follows the existing naming convention and will enable selective execution of LLM inference service e2e tests.
        raise RuntimeError(
            f"Exception when calling CustomObjectsApi->create_namespaced_custom_object for LLMInferenceService: {e}"
        )
🛠️ Refactor suggestion
Improve exception handling by preserving the exception chain.
Use raise ... from e to maintain the exception chain for better debugging and error tracing.
Apply these diffs to fix exception handling:
In create_llmisvc:
 except client.rest.ApiException as e:
     raise RuntimeError(
         f"Exception when calling CustomObjectsApi->create_namespaced_custom_object for LLMInferenceService: {e}"
-    )
+    ) from e
In delete_llmisvc:
 except client.rest.ApiException as e:
     raise RuntimeError(
         f"Exception when calling CustomObjectsApi->delete_namespaced_custom_object for LLMInferenceService: {e}"
-    )
+    ) from e
In get_llmisvc:
 except client.rest.ApiException as e:
     raise RuntimeError(
         f"Exception when calling CustomObjectsApi->get_namespaced_custom_object for LLMInferenceService: {e}"
-    )
+    ) from e
In get_llm_service_url:
 except Exception as e:
-    raise ValueError(f"Failed to get URL for LLM inference service {service_name}: {e}")
+    raise ValueError(f"Failed to get URL for LLM inference service {service_name}: {e}") from e
Also applies to: 175-177, 197-199, 245-245
🧰 Tools
🪛 Ruff (0.11.9)
143-145: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
In test/e2e/llmisvc/test_llm_inference_service.py at lines 143-145 and similarly
at lines 175-177, 197-199, and 245, improve exception handling by changing the
raise statements to use "raise ... from e" syntax. This preserves the original
exception context and stack trace. Locate each raise RuntimeError statement that
includes the exception variable e and modify it to "raise RuntimeError(...) from
e" to maintain the exception chain for better debugging.
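A small self-contained demonstration of what `raise ... from e` preserves. `ApiError`, `call_api`, and `get_resource` are stand-ins invented for this sketch so it runs without a cluster; they are not names from the PR.

```python
class ApiError(Exception):
    """Stand-in for client.rest.ApiException so the sketch runs without a cluster."""


def call_api() -> None:
    # Simulate a failing Kubernetes API call.
    raise ApiError("404 Not Found")


def get_resource() -> None:
    try:
        call_api()
    except ApiError as e:
        # "from e" records the original error as __cause__ of the new one,
        # so tracebacks show both exceptions and how they are related.
        raise RuntimeError(f"Exception when calling the API: {e}") from e


try:
    get_resource()
except RuntimeError as err:
    cause = err.__cause__
    print(f"{type(err).__name__} caused by {type(cause).__name__}: {cause}")
```

With explicit chaining the traceback reads "The above exception was the direct cause of the following exception", which is what the B904 rule asks for.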
async def wait_llm_isvc_ready(
    kserve_client,
    name,
    namespace,
    timeout_seconds=600,
    version=KSERVE_V1ALPHA1_VERSION
):
    """
    Wait for LLM inference service to be ready by checking all conditions
    :param kserve_client: KServe client instance
    :param name: LLM inference service name
    :param namespace: namespace
    :param timeout_seconds: timeout seconds for waiting, default to 600s
    :param version: api group version
    :return: ready LLM inference service object
    """
    llm_isvc_watch(name=name, namespace=namespace, timeout_seconds=timeout_seconds)

    llm_isvc = get_llmisvc(kserve_client, name, namespace, version)

    return llm_isvc
🛠️ Refactor suggestion
Remove unnecessary async declaration or implement true async waiting.
The wait_llm_isvc_ready function is declared as async but doesn't perform any asynchronous operations. It calls the synchronous llm_isvc_watch function.
Either remove the async declaration:
-async def wait_llm_isvc_ready(
+def wait_llm_isvc_ready(
kserve_client,
name,
namespace,
timeout_seconds=600,
version=KSERVE_V1ALPHA1_VERSION
):
And update the test to not await it:
- await wait_llm_isvc_ready(kserve_client, service_name, KSERVE_TEST_NAMESPACE)
+ wait_llm_isvc_ready(kserve_client, service_name, KSERVE_TEST_NAMESPACE)
Or implement true async waiting using asyncio if needed for concurrent operations.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In test/e2e/llmisvc/test_llm_inference_service.py around lines 202 to 223, the
function wait_llm_isvc_ready is declared async but only calls synchronous
functions without awaiting any async operations. To fix this, either remove the
async keyword from the function definition and update all calls to it to not use
await, or refactor the function to perform actual asynchronous waiting using
asyncio constructs if concurrent async behavior is required.
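If the second option is preferred, asynchronous waiting might be sketched as below. This assumes Python 3.9+ for `asyncio.to_thread`; `wait_until_ready` and the `check` callable are hypothetical names, with `check` standing in for a synchronous status lookup such as `get_llmisvc`.

```python
import asyncio
import time
from typing import Callable


async def wait_until_ready(check: Callable[[], bool], timeout_seconds: float = 600.0,
                           interval: float = 5.0) -> None:
    """Poll check() without blocking the event loop until it returns True or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        # Run the blocking check in a worker thread so other tasks keep running.
        if await asyncio.to_thread(check):
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Not ready after {timeout_seconds}s")
        # Yield to the event loop between polls instead of calling time.sleep().
        await asyncio.sleep(interval)
```

A coroutine shaped like this actually awaits something, so keeping `async def` and `await` in the test would then be justified.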
    finally:
        try:
            print(f"{service_url}/v1/completions")
            delete_llmisvc(kserve_client, service_name, KSERVE_TEST_NAMESPACE)
        except Exception as e:
            print(f"Warning: Failed to cleanup service {service_name}: {e}")
🛠️ Refactor suggestion
Move debug print statement before cleanup.
The print statement is currently in the finally block after the service might be deleted, which could be confusing for debugging.
Apply this diff to improve the code structure:
assert response.status_code == 200, f"Expected 200 but got {response.status_code}: {response.text}"
response_data = response.json()
assert "choices" in response_data, "Response should contain 'choices' field"
+ print(f"Successfully tested endpoint: {completion_url}")
+
finally:
try:
- print(f"{service_url}/v1/completions")
delete_llmisvc(kserve_client, service_name, KSERVE_TEST_NAMESPACE)
🧰 Tools
🪛 GitHub Actions: Precommit Check
[error] 108-108: flake8: too many leading '#' for block comment (E266)
🤖 Prompt for AI Agents
In test/e2e/llmisvc/test_llm_inference_service.py around lines 108 to 113, move
the debug print statement that outputs the service URL to before the call to
delete_llmisvc. This ensures the URL is printed before the service deletion
attempt, improving clarity during debugging. Keep the try-except structure to
catch and log any exceptions during cleanup.
if (ready == "True" and
    workloads_ready == "True" and
    router_ready == "True" and
    main_workload_ready == "True"):
    print("All conditions ready")
Fix indentation issue in multi-line condition.
The condition continuation has incorrect indentation causing a flake8 E129 error.
Apply this diff to fix the indentation:
if (ready == "True" and
workloads_ready == "True" and
- router_ready == "True" and
- main_workload_ready == "True"):
+ router_ready == "True" and
+ main_workload_ready == "True"):
print("All conditions ready")
🧰 Tools
🪛 Flake8 (7.2.0)
[error] 312-312: visually indented line with same indent as next logical line
(E129)
🤖 Prompt for AI Agents
In test/e2e/llmisvc/test_llm_inference_service.py around lines 309 to 313, the
multi-line if condition has incorrect indentation causing a flake8 E129 error.
Fix this by aligning the continued lines properly under the opening parenthesis,
ensuring consistent indentation for each condition line to comply with style
guidelines.
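One layout that should satisfy E129 is to indent the continuation lines deeper than the body of the `if`, as PEP 8 suggests for conditional continuation lines. The wrapper function `all_conditions_ready` is hypothetical and exists only to make the snippet runnable; the four variable names come from the reviewed code.

```python
def all_conditions_ready(ready: str, workloads_ready: str,
                         router_ready: str, main_workload_ready: str) -> bool:
    """Return True only when every condition status is the string "True"."""
    # Continuation lines are indented by eight spaces, so they no longer
    # share an indent level with the body of the if statement.
    if (ready == "True" and
            workloads_ready == "True" and
            router_ready == "True" and
            main_workload_ready == "True"):
        print("All conditions ready")
        return True
    return False
```

An alternative with the same effect is `all(status == "True" for status in (...))`, which avoids the multi-line condition altogether.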
Jooho left a comment
Follow DEV.md to set up KinD, create Gateway and Destination Rule
Shouldn't the e2e test include creating this Gateway and Destination Rule?
"template": {
    "containers": [{
        "name": "main",
        "image": "quay.io/pierdipi/vllm-cpu:latest",
Would it be okay to use images from a personal repository for testing purposes?
Yes, it's WIP. We should use the official one for the real thing, but I think that image has not been published lately and its latest version was buggy, which is why we published a newer one. FWIW, that's no different from what we have in the sample and what we used in the demo.
But I agree - we should address this problem.
Maybe a quick improvement for now would be to push that same image to the opendatahub Quay org to avoid referencing mine, but it's OK to stay on the personal one for the purpose of this PR (perhaps extract a constant for quick future edits).
It's part of the PR's TODOs :)
1ed9a72 to d5d7902
@coderabbitai review
✅ Actions performed
Review triggered.
Actionable comments posted: 1
♻️ Duplicate comments (4)
test/e2e/llmisvc/test_llm_inference_service.py (4)
113-115: Fix exception handling by preserving the exception chain.
Use `raise ... from e` to maintain the exception chain for better debugging and error tracing.
138-140: Fix exception handling by preserving the exception chain.
Use `raise ... from e` to maintain the exception chain for better debugging and error tracing.
163-165: Fix exception handling by preserving the exception chain.
Use `raise ... from e` to maintain the exception chain for better debugging and error tracing.
258-260: Fix exception handling by preserving the exception chain.
Use `raise ... from e` to maintain the exception chain for better debugging and error tracing.
🧹 Nitpick comments (2)
test/e2e/llmisvc/test_llm_inference_service.py (2)
50-50: Consider using official container images for production tests.
The test uses a personal repository image (`quay.io/pierdipi/vllm-cpu:latest`). For production-ready tests, consider using official images to ensure consistency and reliability.
Do you want me to help identify official alternatives or create an issue to track this improvement?
213-213: Improve polling efficiency with exponential backoff.
The current polling uses a fixed 10-second interval, which may be inefficient. Consider implementing exponential backoff for faster initial checks while avoiding overwhelming the service.
Apply this diff to improve the polling strategy:
- wait_for(assert_model_responds, timeout=timeout_seconds, interval=10.0)
+ wait_for(assert_model_responds, timeout=timeout_seconds, interval=2.0)
And consider enhancing the `wait_for` function with exponential backoff:
import time


def wait_for(assertion_fn, timeout: float = 5.0, initial_interval: float = 0.1, max_interval: float = 10.0):
    """
    Repeatedly calls assertion_fn() with exponential backoff until success or timeout.
    """
    deadline = time.time() + timeout
    interval = initial_interval
    while True:
        try:
            return assertion_fn()
        except AssertionError:
            # Give up once the deadline has passed; otherwise back off and retry.
            if time.time() >= deadline:
                raise
            time.sleep(interval)
            interval = min(interval * 1.5, max_interval)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- python/kserve/kserve/__init__.py (1 hunks)
- test/e2e/llmisvc/test_llm_inference_service.py (1 hunks)
- test/e2e/pytest.ini (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- test/e2e/pytest.ini
🧰 Additional context used
🪛 Ruff (0.11.9)
test/e2e/llmisvc/test_llm_inference_service.py
113-115: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
138-140: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
163-165: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
191-191: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
204-204: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
258-260: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
python/kserve/kserve/__init__.py
52-52: .models.v1alpha1_llm_inference_service.V1alpha1LLMInferenceService imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
54-54: .models.v1alpha1_llm_inference_service_config.V1alpha1LLMInferenceServiceConfig imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
57-57: .models.v1alpha1_llm_inference_service_config_list.V1alpha1LLMInferenceServiceConfigList imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
59-59: .models.v1alpha1_llm_inference_service_list.V1alpha1LLMInferenceServiceList imported but unused; consider removing, adding to __all__, or using a redundant alias
(F401)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: precommit-check
- GitHub Check: build (3.9)
- GitHub Check: build (3.11)
- GitHub Check: build (3.12)
- GitHub Check: test
- GitHub Check: build (3.10)
- GitHub Check: Build
🔇 Additional comments (1)
python/kserve/kserve/__init__.py (1)
52-59: LGTM! New LLM inference service models properly exposed in public API.
The new imports correctly expose the LLM inference service model classes in the package's public API, following the established pattern. The static analysis warnings about unused imports are false positives: these imports are intended for external consumption.
@pytest.mark.llminferenceservice
@pytest.mark.asyncio(scope="session")
async def test_llm_inference_service_facebook_opt():
🛠️ Refactor suggestion
Remove unnecessary async marker or add async operations.
The test function is marked with @pytest.mark.asyncio but doesn't contain any await statements or async operations. This creates unnecessary overhead.
Apply this diff to remove the async marker:
@pytest.mark.llminferenceservice
-@pytest.mark.asyncio(scope="session")
-async def test_llm_inference_service_facebook_opt():
+def test_llm_inference_service_facebook_opt():
🤖 Prompt for AI Agents
In test/e2e/llmisvc/test_llm_inference_service.py around lines 31 to 33, the
test function is marked as async with @pytest.mark.asyncio but contains no await
or async operations. Remove the @pytest.mark.asyncio decorator and the async
keyword from the function definition to eliminate unnecessary async overhead.
…odel

> [!IMPORTANT]
> First commit, fixing Python packages can be upstreamed separately.

### How to run

Follow `DEV.md` to set up KinD, create Gateway and Destination Rule resources.

#### Set up python venv

```shell
python -m venv venv
source venv/bin/activate
cd python/kserve
pip install -e .
pip install pytest pytest-asyncio tabulate requests
cd -
```

#### Execute the test

```shell
kind get kubeconfig --name kserve-llm-d > kubeconfig.e2e
KUBECONFIG=kubeconfig.e2e pytest test/e2e/llmisvc/test_llm_inference_service.py::test_llm_inference_service_facebook_opt -v
```

> [!IMPORTANT]
> It seems that wait conditions are a bit flaky. If you set everything up before starting the test (and comment out create/delete LLMISVC in the test code), the test passes.

### TODO

- make watch funcs a bit less verbose :)
- improve assertions (it's now checking `200 OK` only)
- automate setup (gateway and destination rule)
- include as github action
- expand test scenarios

Signed-off-by: Bartosz Majsak <bartosz.majsak@gmail.com>
d5d7902 to 2fa554a
pierDipi left a comment
This is great!
I'd love to see the test instructions in the PR body added to DEV doc for future quick reference.
Generally LGTM!
Closing in favor of #774
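Important
First commit, fixing Python packages can be upstreamed separately.
How to run
Follow `DEV.md` to set up KinD, create Gateway and Destination Rule resources.
Set up python venv
    python -m venv venv
    source venv/bin/activate
    cd python/kserve
    pip install -e .
    pip install pytest pytest-asyncio tabulate requests
    cd -
Execute the test
    kind get kubeconfig --name kserve-llm-d > kubeconfig.e2e
    KUBECONFIG=kubeconfig.e2e pytest test/e2e/llmisvc/test_llm_inference_service.py::test_llm_inference_service_facebook_opt -v
TODO
- make watch funcs a bit less verbose :)
- improve assertions (it's now checking `200 OK` only)
- automate setup (gateway and destination rule)
- include as github action
- expand test scenarios
Summary by CodeRabbit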
New Features
Tests