Merged
38 commits
3aff1b7
test(e2e): introduces e2e test skeleton and single-node cpu model
bartoszmajsak Jul 11, 2025
d3b53d2
fix: apply crds first and wait for them to be ready
bartoszmajsak Jul 24, 2025
c7e76bd
chore: limit gh-action tests to cpu
bartoszmajsak Jul 24, 2025
d2adcf3
fix: single worker job
bartoszmajsak Jul 24, 2025
5e0e874
feat: introduces cluster capability markers
bartoszmajsak Jul 25, 2025
2752ce1
chore: excludes tests from flake8
bartoszmajsak Jul 24, 2025
20e5869
chore: precommit fixes
bartoszmajsak Jul 24, 2025
5aceb3b
chore: no need for local pytest.ini as its ignored anyway
bartoszmajsak Jul 25, 2025
5299d05
lint: adds possibility to ignore unused warnings
bartoszmajsak Jul 25, 2025
2b3a4c6
precommit fixes
bartoszmajsak Jul 25, 2025
7b6f752
Fail early on CRDs
bartoszmajsak Jul 25, 2025
1b07471
chore: test/e2e/llmisvc/README.md
bartoszmajsak Jul 25, 2025
1121cd4
chore: simplifies test fixtures
bartoszmajsak Jul 25, 2025
1266477
fix: adjusts gh action to run on cpu cluster
bartoszmajsak Jul 25, 2025
f270cfa
feat: adds logging decorator
bartoszmajsak Jul 25, 2025
ba58345
feat: logging and cr dump
bartoszmajsak Jul 25, 2025
405cad4
chore: bumps python to 3.12 for e2e job
bartoszmajsak Jul 25, 2025
9587f18
fix: mismatched class name in example comment
bartoszmajsak Jul 25, 2025
dfade03
fix: clarifies workload preset
bartoszmajsak Jul 25, 2025
176da03
chore: removes noise
bartoszmajsak Jul 25, 2025
8210c45
fix: imports inference service config factory
bartoszmajsak Jul 28, 2025
e75b15e
chore: bumps resource limits
bartoszmajsak Jul 28, 2025
6be1e7c
chore: cleanup
bartoszmajsak Jul 28, 2025
e7b015c
feat: adds simple p/d deployment
bartoszmajsak Jul 28, 2025
a4ca343
feat: makes response timeout configurable with 60s default
bartoszmajsak Jul 28, 2025
1927931
midstream: disable gh-action
bartoszmajsak Jul 28, 2025
0623b62
chore: uses name variable
bartoszmajsak Jul 28, 2025
8d62551
chore: removes redundant preset
bartoszmajsak Jul 28, 2025
48d6b0b
fix: minor precommit linter findings
bartoszmajsak Jul 28, 2025
b56022a
chore: removes leftover empty file
bartoszmajsak Jul 28, 2025
a878d75
chore: reworks related resources dump to exclude certain kinds
bartoszmajsak Jul 28, 2025
492d47d
chore: minor fixes in README
bartoszmajsak Jul 29, 2025
01e714a
fix: filters out *List resources
bartoszmajsak Jul 29, 2025
6392bcd
chore: removes redundant init-container
bartoszmajsak Jul 29, 2025
6655269
fix: makes test params xdict-friendly
bartoszmajsak Jul 29, 2025
cc45a46
chore: clean up
bartoszmajsak Jul 29, 2025
1ba1e0c
review
bartoszmajsak Jul 29, 2025
03b15b5
precommit fix
bartoszmajsak Jul 29, 2025
2 changes: 1 addition & 1 deletion .flake8
@@ -1,6 +1,6 @@
[flake8]
max-line-length = 120
- extend-ignore = E203,E501,E701, B017
+ extend-ignore = E203,E501,E701,B017,F401,F811
exclude =
.venv,
venv,
6 changes: 5 additions & 1 deletion .github/actions/kserve-dep-setup/action.yml
@@ -13,6 +13,10 @@ inputs:
description: 'Enable KEDA for autoscaling'
required: false
default: 'false'
enable-lws:
description: 'Enable Leader Worker Set'
required: false
default: 'false'
runs:
using: "composite"
steps:
@@ -26,7 +30,7 @@ runs:
./test/scripts/gh-actions/setup-kourier.sh
else
echo "Selected network layer ${{ inputs.network-layer }}"
- ./test/scripts/gh-actions/setup-deps.sh ${{ inputs.deployment-mode }} "${{ inputs.network-layer }}" "${{ inputs.enable-keda }}"
+ ./test/scripts/gh-actions/setup-deps.sh ${{ inputs.deployment-mode }} "${{ inputs.network-layer }}" "${{ inputs.enable-keda }}" "${{ inputs.enable-lws }}"

💡 Verification agent

🧩 Analysis chain

Arg list length changed – confirm script signature
setup-deps.sh is now invoked with four arguments. Ensure the shell script was updated accordingly; otherwise the fourth argument will be silently ignored, or the positional parameters will shift and break existing logic.


🏁 Script executed:

#!/bin/bash
# List the first lines of the helper script to verify it expects 4 args
head -n 20 test/scripts/gh-actions/setup-deps.sh

Length of output: 828


Update setup-deps.sh to handle four parameters
The action is now invoking the script with four args, but the script’s header and parameter parsing only expect two. This will misalign $1/$2 and drop $3/$4.

Please update:

  • test/scripts/gh-actions/setup-deps.sh
    • Extend the usage banner to include the new flags
    • Add parsing logic for $3 (enable-keda) and $4 (enable-lws)

Example diff to apply at the top of setup-deps.sh:

- # Usage: setup-deps.sh $DEPLOYMENT_MODE $NETWORK_LAYER
+ # Usage: setup-deps.sh $DEPLOYMENT_MODE $NETWORK_LAYER $ENABLE_KEDA $ENABLE_LWS

And then consume the extra params in the script, e.g.:

DEPLOYMENT_MODE=$1
NETWORK_LAYER=$2
ENABLE_KEDA=$3
ENABLE_LWS=$4
🤖 Prompt for AI Agents
In .github/actions/kserve-dep-setup/action.yml at line 33, the script
setup-deps.sh is called with four parameters, but the script currently only
expects two. To fix this, update test/scripts/gh-actions/setup-deps.sh by
extending its usage banner to document all four parameters and modify the
parameter parsing logic to assign $3 to ENABLE_KEDA and $4 to ENABLE_LWS
variables, ensuring all four inputs are correctly handled within the script.
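A minimal sketch of the parsing suggested above, with fallbacks so that callers passing only two arguments keep working. The function name `parse_setup_args` and the default values are assumptions made for illustration; they are not taken from the actual setup-deps.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: the name and the defaults below are assumptions,
# not the real contents of setup-deps.sh.
parse_setup_args() {
  # ${N:-default} falls back when an argument is missing or empty,
  # so older two-argument invocations still get sane values.
  DEPLOYMENT_MODE="${1:-serverless}"
  NETWORK_LAYER="${2:-istio}"
  ENABLE_KEDA="${3:-false}"
  ENABLE_LWS="${4:-false}"
}

parse_setup_args "raw" "istio-gatewayapi-ext" "false" "true"
echo "mode=${DEPLOYMENT_MODE} network=${NETWORK_LAYER} keda=${ENABLE_KEDA} lws=${ENABLE_LWS}"
```

Because the action passes `${{ inputs.deployment-mode }}` unquoted, an empty value there would shift every later argument by one position; quoting it in action.yml would make the defaults above reliable.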

fi

- name: Update test overlays
29 changes: 28 additions & 1 deletion .github/actions/minikube-setup/action.yml
@@ -14,6 +14,10 @@ inputs:
description: 'Additional arguments to pass to minikube start'
required: false
default: ''
addons:
description: 'Choose optional addons to install. Valid options: metallb, ingress, gcp-auth, registry ...'
required: false
default: ''

runs:
using: "composite"
@@ -29,11 +33,34 @@ runs:
minikube-version: '1.35.0'
kubernetes-version: 'v1.30.7'
driver: ${{ inputs.driver }}
addons: ${{ inputs.addons }}
wait: 'all'
cpus: 'max'
memory: 'max'
start-args: --wait-timeout=6m0s --nodes=${{ inputs.nodes }} ${{ inputs.start-args }}

- name: Configure MetalLB for Minikube
if: ${{ contains(inputs.addons, 'metallb') }}
shell: bash
run: |
IP=$(minikube ip)
PREFIX=${IP%.*}
START=${PREFIX}.200
END=${PREFIX}.235

kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
namespace: metallb-system
name: config
data:
config: |
address-pools:
- name: default
protocol: layer2
addresses:
- ${START}-${END}
EOF
Comment on lines +41 to +63

🛠️ Refactor suggestion

Race condition: MetalLB CRs applied before namespace exists

metallb-system namespace is created asynchronously by minikube addons.
If the namespace isn’t ready when this step runs, the kubectl apply will fail and the whole job stops.

Minimal fix – wait for the namespace:

-        IP=$(minikube ip)
+        # Wait until the metallb-system namespace appears (max 60s)
+        for i in {1..12}; do
+          kubectl get ns metallb-system && break || sleep 5
+        done
+
+        IP=$(minikube ip)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: Configure MetalLB for Minikube
if: ${{ contains(inputs.addons, 'metallb') }}
shell: bash
run: |
IP=$(minikube ip)
PREFIX=${IP%.*}
START=${PREFIX}.200
END=${PREFIX}.235
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
namespace: metallb-system
name: config
data:
config: |
address-pools:
- name: default
protocol: layer2
addresses:
- ${START}-${END}
EOF
- name: Configure MetalLB for Minikube
if: ${{ contains(inputs.addons, 'metallb') }}
shell: bash
run: |
# Wait until the metallb-system namespace appears (max 60s)
for i in {1..12}; do
kubectl get ns metallb-system && break || sleep 5
done
IP=$(minikube ip)
PREFIX=${IP%.*}
START=${PREFIX}.200
END=${PREFIX}.235
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
namespace: metallb-system
name: config
data:
config: |
address-pools:
- name: default
protocol: layer2
addresses:
- ${START}-${END}
EOF
🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 49-49: trailing spaces

(trailing-spaces)

🤖 Prompt for AI Agents
In .github/actions/minikube-setup/action.yml between lines 41 and 63, the
MetalLB configuration is applied before ensuring the 'metallb-system' namespace
exists, causing a race condition. Fix this by adding a wait loop or command
before the kubectl apply step to check for the existence of the 'metallb-system'
namespace and only proceed once it is confirmed to exist, preventing the apply
command from failing.
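One caveat with the suggested loop: if the namespace never appears, the loop simply ends and the script carries on to `kubectl apply`. A possible variant is a small retry helper that returns non-zero on timeout. `wait_for` is a hypothetical name, not something defined in this repository.

```shell
#!/usr/bin/env bash
# wait_for is a hypothetical helper, not part of the action under review.
# Usage: wait_for <attempts> <delay-seconds> <command> [args...]
wait_for() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    # Succeed as soon as the probed command succeeds.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # No point sleeping after the final attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  echo "timed out after ${attempts} attempts: $*" >&2
  return 1
}

# In the MetalLB step this might be used as (not executed here):
#   wait_for 12 5 kubectl get ns metallb-system || exit 1
```

With 12 attempts and a 5 second delay this keeps roughly the same 60 second budget as the suggestion, but a missing namespace now fails the step with a clear message.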

- name: Check Kubernetes pods
shell: bash
run: kubectl get pods -n kube-system
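The MetalLB step in this file derives its address pool from the node IP by stripping the last octet with `${IP%.*}`. A small sketch of that derivation follows, assuming an IPv4 address on a /24-style network; `metallb_range` is a hypothetical helper name, since the action inlines this logic.

```shell
#!/usr/bin/env bash
# metallb_range is a hypothetical helper; the action inlines this logic.
metallb_range() {
  local ip="$1"
  # %.* removes the shortest suffix matching ".*", i.e. the last octet.
  local prefix="${ip%.*}"
  echo "${prefix}.200-${prefix}.235"
}

# Sample IP for illustration; the real value comes from `minikube ip`.
metallb_range "192.168.49.2"
```

The fixed `.200` to `.235` range assumes those addresses are unused on the minikube network.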
176 changes: 176 additions & 0 deletions .github/workflows/e2e-test-llmisvc.yml
Author
This is a copy of the main e2e-test.yml workflow with a new job added, kept as a separate file to avoid conflicts before we ship it upstream.

@@ -0,0 +1,176 @@
name: LLMInferenceService E2E Tests

on:
pull_request:
branches: [master, release*, feature-llmd-* ]
paths:
- "**"
- "!.github/**"
- "!docs/**"
- "!**.md"
- ".github/workflows/e2e-test-llmisvc.yml"
workflow_dispatch:

env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
DOCKER_IMAGES_PATH: "/mnt/docker-images"
DOCKER_REPO: "kserve"
# artifact prefixes for bulk download
PREDICTOR_ARTIFACT_PREFIX: "pred"
EXPLAINER_ARTIFACT_PREFIX: "exp"
TRANSFORMER_ARTIFACT_PREFIX: "trans"
GRAPH_ARTIFACT_PREFIX: "graph"
BASE_ARTIFACT_PREFIX: "base"
# Controller images
CONTROLLER_IMG: "kserve-controller"
LOCALMODEL_CONTROLLER_IMG: "kserve-localmodel-controller"
LOCALMODEL_AGENT_IMG: "kserve-localmodelnode-agent"
STORAGE_INIT_IMG: "storage-initializer"
AGENT_IMG: "agent"
ROUTER_IMG: "router"
# Predictor runtime server images
SKLEARN_IMG: "sklearnserver"
XGB_IMG: "xgbserver"
LGB_IMG: "lgbserver"
PMML_IMG: "pmmlserver"
PADDLE_IMG: "paddleserver"
CUSTOM_MODEL_GRPC_IMG: "custom-model-grpc"
CUSTOM_MODEL_GRPC_IMG_TAG: "kserve/custom-model-grpc:${{ github.sha }}"
HUGGINGFACE_IMG: "huggingfaceserver"
# Explainer images
ART_IMG: "art-explainer"
# Transformer images
IMAGE_TRANSFORMER_IMG: "image-transformer"
IMAGE_TRANSFORMER_IMG_TAG: "kserve/image-transformer:${{ github.sha }}"
CUSTOM_TRANSFORMER_GRPC_IMG: "custom-image-transformer-grpc"
# Graph images
SUCCESS_200_ISVC_IMG: "success-200-isvc"
ERROR_404_ISVC_IMG: "error-404-isvc"

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
test-llmisvc:
if: false
runs-on: ubuntu-22.04
needs: [ kserve-image-build ]
steps:
- name: Checkout source
uses: actions/checkout@v4

- name: Free-up disk space
uses: ./.github/actions/free-up-disk-space

- name: Setup Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod

- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: "3.12"

- name: Setup Minikube
uses: ./.github/actions/minikube-setup
with:
addons: "metallb"

- name: KServe dependency setup
uses: ./.github/actions/kserve-dep-setup
with:
network-layer: 'istio-gatewayapi-ext'
enable-lws: 'true'

- name: Download base images
uses: ./.github/actions/base-download

- name: Install Poetry and version plugin
run: ./test/scripts/gh-actions/setup-poetry.sh

- name: Install KServe
run: |
./test/scripts/gh-actions/setup-kserve.sh "raw" "istio-gatewayapi-ext"

- name: Run E2E tests
timeout-minutes: 30
run: |
# Run only CPU tests for now using pytest markers (cluster_)
# Available GPU vendors: amd, nvidia, intel
./test/scripts/gh-actions/run-e2e-tests.sh "llminferenceservice and cluster_cpu" 2 "istio-gatewayapi-ext"

- name: Check system status
if: always()
run: |
./test/scripts/gh-actions/status-check.sh

kserve-image-build:
if: false
runs-on: ubuntu-latest
steps:
- name: Checkout source
uses: actions/checkout@v4

- name: Free-up disk space
uses: ./.github/actions/free-up-disk-space

- name: Setup Docker Buildx
uses: docker/setup-buildx-action@v3

- name: Build KServe images
run: |
sudo mkdir -p ${DOCKER_IMAGES_PATH}
sudo chown -R $USER ${DOCKER_IMAGES_PATH}
./test/scripts/gh-actions/build-images.sh
docker image ls
sudo ls -lh ${DOCKER_IMAGES_PATH}
Comment on lines +123 to +128

⚠️ Potential issue

Fix shell variable quoting issues.

The shell script execution has unquoted variables that could cause word splitting issues.

-          sudo mkdir -p ${DOCKER_IMAGES_PATH}
-          sudo chown -R $USER ${DOCKER_IMAGES_PATH}
+          sudo mkdir -p "${DOCKER_IMAGES_PATH}"
+          sudo chown -R "$USER" "${DOCKER_IMAGES_PATH}"
           ./test/scripts/gh-actions/build-images.sh
           docker image ls
-          sudo ls -lh ${DOCKER_IMAGES_PATH}
+          sudo ls -lh "${DOCKER_IMAGES_PATH}"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
run: |
sudo mkdir -p ${DOCKER_IMAGES_PATH}
sudo chown -R $USER ${DOCKER_IMAGES_PATH}
./test/scripts/gh-actions/build-images.sh
docker image ls
sudo ls -lh ${DOCKER_IMAGES_PATH}
run: |
sudo mkdir -p "${DOCKER_IMAGES_PATH}"
sudo chown -R "$USER" "${DOCKER_IMAGES_PATH}"
./test/scripts/gh-actions/build-images.sh
docker image ls
sudo ls -lh "${DOCKER_IMAGES_PATH}"
🧰 Tools
🪛 actionlint (1.7.7)

120-120: shellcheck reported issue in this script: SC2086:info:1:15: Double quote to prevent globbing and word splitting

(shellcheck)


120-120: shellcheck reported issue in this script: SC2086:info:2:15: Double quote to prevent globbing and word splitting

(shellcheck)


120-120: shellcheck reported issue in this script: SC2086:info:2:21: Double quote to prevent globbing and word splitting

(shellcheck)


120-120: shellcheck reported issue in this script: SC2086:info:5:13: Double quote to prevent globbing and word splitting

(shellcheck)

🤖 Prompt for AI Agents
In .github/workflows/e2e-test-llmisvc.yml around lines 120 to 125, the shell
variables like ${DOCKER_IMAGES_PATH} and $USER are unquoted, which can lead to
word splitting or globbing issues if the variables contain spaces or special
characters. Fix this by adding double quotes around all variable references in
the shell commands, for example, change ${DOCKER_IMAGES_PATH} to
"${DOCKER_IMAGES_PATH}" and $USER to "$USER" to ensure safe and correct variable
expansion.
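To see what SC2086 is guarding against, here is a small demonstration of word splitting. In this workflow `DOCKER_IMAGES_PATH` is `/mnt/docker-images`, which has no spaces, so the issue is latent; the path with a space below is a made-up value for illustration.

```shell
#!/usr/bin/env bash
# Prints how many arguments it received.
count_args() { echo "$#"; }

# Hypothetical path containing a space (not the value used in the workflow).
images_path="/mnt/docker images"

# Unquoted: the shell splits the value on whitespace into two words.
unquoted=$(count_args ${images_path})
# Quoted: the value is passed as a single argument.
quoted=$(count_args "${images_path}")

echo "unquoted=${unquoted} quoted=${quoted}"
```

With the unquoted form, `mkdir -p` would create two directories instead of one.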

Comment thread
bartoszmajsak marked this conversation as resolved.

- name: Upload controller image
uses: actions/upload-artifact@v4
with:
name: ${{ env.BASE_ARTIFACT_PREFIX }}-${{ env.CONTROLLER_IMG }}-${{ github.sha }}
path: ${{ env.DOCKER_IMAGES_PATH }}/${{ env.CONTROLLER_IMG }}-${{ github.sha }}
compression-level: 0
if-no-files-found: error

- name: Upload localmodel controller image
uses: actions/upload-artifact@v4
with:
name: ${{ env.BASE_ARTIFACT_PREFIX }}-${{ env.LOCALMODEL_CONTROLLER_IMG }}-${{ github.sha }}
path: ${{ env.DOCKER_IMAGES_PATH }}/${{ env.LOCALMODEL_CONTROLLER_IMG }}-${{ github.sha }}
compression-level: 0
if-no-files-found: error

- name: Upload localmodel agent image
uses: actions/upload-artifact@v4
with:
name: ${{ env.BASE_ARTIFACT_PREFIX }}-${{ env.LOCALMODEL_AGENT_IMG }}-${{ github.sha }}
path: ${{ env.DOCKER_IMAGES_PATH }}/${{ env.LOCALMODEL_AGENT_IMG }}-${{ github.sha }}
compression-level: 0
if-no-files-found: error

- name: Upload agent image
uses: actions/upload-artifact@v4
with:
name: ${{ env.BASE_ARTIFACT_PREFIX }}-${{ env.AGENT_IMG }}-${{ github.sha }}
path: ${{ env.DOCKER_IMAGES_PATH }}/${{ env.AGENT_IMG }}-${{ github.sha }}
compression-level: 0
if-no-files-found: error

- name: Upload storage initializer image
uses: actions/upload-artifact@v4
with:
name: ${{ env.BASE_ARTIFACT_PREFIX }}-${{ env.STORAGE_INIT_IMG }}-${{ github.sha }}
path: ${{ env.DOCKER_IMAGES_PATH }}/${{ env.STORAGE_INIT_IMG }}-${{ github.sha }}
compression-level: 0
if-no-files-found: error

- name: Upload router image
uses: actions/upload-artifact@v4
with:
name: ${{ env.BASE_ARTIFACT_PREFIX }}-${{ env.ROUTER_IMG }}-${{ github.sha }}
path: ${{ env.DOCKER_IMAGES_PATH }}/${{ env.ROUTER_IMG }}-${{ github.sha }}
compression-level: 0
if-no-files-found: error
2 changes: 2 additions & 0 deletions Makefile
@@ -253,6 +253,8 @@ deploy-dev-llm:
./hack/deploy_dev_llm.sh

deploy-ci: manifests
kubectl apply --server-side=true -k config/crd
kubectl wait --for=condition=established --timeout=60s crd/llminferenceserviceconfigs.serving.kserve.io
kubectl apply --server-side=true -k config/overlays/test
# TODO: Add runtimes as part of default deployment
kubectl wait --for=condition=ready pod -l control-plane=kserve-controller-manager -n kserve --timeout=300s
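The ordering in `deploy-ci` (CRDs first, wait until established, then the overlay) can be sketched as a shell function that stops at the first failing step. `deploy_ci` is a hypothetical name; the kubectl invocations are copied from the recipe above, and the stated reason for the ordering is an assumption based on the commit message "apply crds first and wait for them to be ready".

```shell
#!/usr/bin/env bash
# deploy_ci is a hypothetical shell rendering of the Makefile recipe.
deploy_ci() {
  # CRDs go first, presumably because the overlay creates resources of these kinds.
  kubectl apply --server-side=true -k config/crd || return 1
  # Fail early if the CRD is not established within 60s.
  kubectl wait --for=condition=established --timeout=60s \
    crd/llminferenceserviceconfigs.serving.kserve.io || return 1
  kubectl apply --server-side=true -k config/overlays/test || return 1
  # Finally wait for the controller pod to become ready.
  kubectl wait --for=condition=ready pod \
    -l control-plane=kserve-controller-manager -n kserve --timeout=300s
}

# Usage (requires a cluster and a repository checkout; not executed here):
#   deploy_ci || exit 1
```

Make already aborts a recipe on the first non-zero exit status, so the explicit `|| return 1` only mirrors that behaviour in plain shell.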