feat(helm): add LLMGateway helm chart#1998

Merged
steebchen merged 4 commits into theopenco:main from yezz123:helm-integration
Apr 19, 2026

Conversation

Contributor

@yezz123 yezz123 commented Apr 9, 2026

This pull request introduces a Helm chart for deploying LLM Gateway on Kubernetes, along with comprehensive documentation and templating for multiple components and configurations.

The changes add all necessary Helm chart files, Kubernetes manifests for various services, and documentation updates to guide users in deploying LLM Gateway via Helm.

Helm Chart Introduction and Documentation:

  • Added a new Helm chart for LLM Gateway, including a Chart.yaml with metadata and configuration for Kubernetes deployments.
  • Added a README.md in the Helm chart directory with installation instructions and usage details.
  • Updated the self-hosting documentation to mention Helm deployment and link to further Helm chart documentation.

cc @steebchen @smakosh 🫡

Summary by CodeRabbit

  • New Features

    • Added a Helm chart to deploy and configure all application components (API, gateway, UI, worker, docs, admin, playground, code)
    • Built-in PostgreSQL and Redis with optional external DB/Redis support, ingress support, and extensive runtime/config overrides (images, replicas, resources, env, node selection, secrets, LLM provider keys)
    • Chart is published to a container registry on releases for easy helm install
  • Documentation

    • Updated self-hosting docs with Helm installation, version pinning, and links to chart resources

Contributor

coderabbitai Bot commented Apr 9, 2026

Walkthrough

Adds a new Helm chart for LLM Gateway (chart metadata, helpers, templates, values), CI job to publish the chart to GHCR, and documentation updates with Helm install instructions for deploying to Kubernetes.

Changes

Cohort / File(s) Summary
Documentation
apps/docs/content/self-host.mdx, infra/helm/README.md
Added Helm installation docs and links to chart README and version listing; shows OCI install command and --version pinning guidance.
Chart Metadata & Helpers
infra/helm/llmgateway/Chart.yaml, infra/helm/llmgateway/templates/_helpers.tpl
New Chart.yaml and helper templates providing naming, labels, image resolution, secret/config names, DB/Redis host/port helpers, and service URL builders.
Component Deployments
infra/helm/llmgateway/templates/*-deployment.yaml (admin, api, code, docs, gateway, playground, ui, worker)
Added conditional Deployment templates for 8 components with replicas, probes, env/config wiring, resources, scheduling, and init containers (API/Gateway/Worker) for DB/Redis readiness.
Component Services
infra/helm/llmgateway/templates/*-service.yaml (admin, api, code, docs, gateway, playground, ui)
Added Service templates for each component with configurable service type (default ClusterIP) and selector/labels.
Stateful Data Plane
infra/helm/llmgateway/templates/postgresql-statefulset.yaml, .../postgresql-service.yaml, .../redis-statefulset.yaml, .../redis-service.yaml
Added StatefulSet and Service templates for PostgreSQL and Redis with persistence, credentials, probes, and optional password/auth handling.
Config & Secrets
infra/helm/llmgateway/templates/configmap.yaml, infra/helm/llmgateway/templates/secret.yaml
Added ConfigMap with many runtime keys (URLs, auth, gateway/API/database/worker/billing/observability/content-filtering) and a Secret template emitting DB/Redis credentials, OAuth keys, LLM provider keys, Stripe, email, and webhook secrets.
Ingress & Top-level Values
infra/helm/llmgateway/templates/ingress.yaml, infra/helm/llmgateway/values.yaml
Ingress template with per-component host rules and optional TLS; comprehensive values.yaml (≈450 lines) defining defaults for all components, external/internal DB/Redis, providers, billing, observability, and feature toggles.
CI: Publish Chart
.github/workflows/images.yml
New publish-helm workflow job: on release, installs Helm, updates Chart.yaml version/appVersion, packages chart and pushes OCI chart to GHCR under charts/<OWNER>.

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant Release as "GitHub Release"
  participant Actions as "GitHub Actions (publish-helm)"
  participant GHCR as "GitHub Container Registry (OCI)"
  participant User as "Operator (helm CLI)"
  participant K8s as "Kubernetes API"

  Release->>Actions: trigger publish-helm
  Actions->>Actions: update Chart.yaml (version/appVersion)
  Actions->>GHCR: push OCI chart package
  User->>GHCR: helm install oci://ghcr.io/.../llmgateway
  GHCR->>User: chart blob
  User->>K8s: Helm renders templates / apply manifests
  K8s->>K8s: create ConfigMap, Secret, Services, Deployments, StatefulSets, Ingress
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped: CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The pull request title 'feat(helm): add LLMGateway helm chart' accurately describes the main change—introducing a new Helm chart for LLM Gateway deployment.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


@github-actions github-actions Bot changed the title from "Add Helm chart and deployment instructions for LLMGateway" to "feat(helm): add LLMGateway helm chart" on Apr 9, 2026
@yezz123 yezz123 force-pushed the helm-integration branch from 81d9cd1 to db4e2a5 on April 9, 2026 at 22:36

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 12

🧹 Nitpick comments (5)
infra/helm/llmgateway/Chart.yaml (1)

7-7: Use a concrete appVersion instead of latest for proper release tracking.

Line 7 uses appVersion: "latest", which makes chart metadata non-deterministic. According to Helm best practices, appVersion should reflect the actual application version included in the chart (e.g., matching the default image tag deployed by the chart) to enable proper release tracking and automation.

🔧 Suggested change
-appVersion: "latest"
+appVersion: "0.0.0"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/Chart.yaml` at line 7, Replace the non-deterministic
appVersion value in Chart.yaml (appVersion: "latest") with a concrete
application version string that matches the chart’s default image tag; update
the appVersion field to the actual semantic version (e.g., "v1.2.3") and, if
necessary, align values.yaml image.tag (or any default image tag referenced by
the chart) and CI/release automation to keep them in sync so chart metadata is
deterministic and traceable.
infra/helm/llmgateway/templates/ingress.yaml (1)

27-110: Consider generating rules via a loop to reduce drift risk.

The repeated component blocks are easy to desync (host key, service suffix, enable flag). A small range-driven structure would make future additions safer.
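A range-driven version of that suggestion could look roughly like this; the component keys and the `llmgateway.fullname`/`ingress.hosts` value paths are taken from the review comment, and the path/port details are assumed rather than verified against the chart:

```yaml
{{- range $key := list "ui" "api" "gateway" "playground" "code" "docs" "admin" }}
{{- $comp := index $.Values $key }}
{{- $host := index $.Values.ingress.hosts $key }}
{{- if and $comp.enabled $host }}
    - host: {{ $host | quote }}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                # service name follows the <fullname>-<component> convention
                name: {{ printf "%s-%s" (include "llmgateway.fullname" $) $key }}
                port:
                  name: http
{{- end }}
{{- end }}
```

Adding a component then becomes a one-word change to the list instead of a new hand-copied block.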

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/templates/ingress.yaml` around lines 27 - 110, Replace
the repeated per-component ingress blocks with a single range that iterates over
a map/list of component keys and generates host, enabled check and service name
dynamically; e.g., range over
["ui","api","gateway","playground","code","docs","admin"] and for each key,
check .Values.<key>.enabled and .Values.ingress.hosts.<key>, use the host value
and build the backend service name with include "llmgateway.fullname" . | printf
"%s-%s" (include "llmgateway.fullname" .) key and port name http; this keeps the
logic centralized (references: .Values.ui.enabled, .Values.ingress.hosts.<key>,
include "llmgateway.fullname") and removes the duplicated blocks.
infra/helm/llmgateway/templates/gateway-deployment.yaml (2)

22-42: Init containers can wait indefinitely without timeout.

The wait-for-postgresql and wait-for-redis init containers loop forever until the services are reachable. If the dependent services are misconfigured or fail to start, the pod will remain in Init state indefinitely.

Consider adding a timeout or max retry count to fail fast and surface the issue.

♻️ Proposed fix with timeout
        - name: wait-for-postgresql
          image: busybox:1.37
          command:
            - sh
            - -c
            - |
+             TIMEOUT=300
+             ELAPSED=0
              until nc -z {{ include "llmgateway.postgresql.host" . }} {{ include "llmgateway.postgresql.port" . }}; do
                echo "Waiting for PostgreSQL..."
                sleep 2
+               ELAPSED=$((ELAPSED + 2))
+               if [ $ELAPSED -ge $TIMEOUT ]; then
+                 echo "Timeout waiting for PostgreSQL"
+                 exit 1
+               fi
              done
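The same bounded-wait idea, factored into a standalone POSIX-shell function. The helper name, default timeout, and 2-second poll interval are illustrative, not taken from the chart:

```shell
#!/bin/sh
# wait_for HOST PORT [TIMEOUT_SECONDS]: poll with nc until the port is
# reachable, or return 1 once the timeout is exceeded so the init container
# fails fast instead of hanging in Init state forever.
wait_for() {
  host=$1; port=$2; timeout=${3:-300}
  elapsed=0
  until nc -z "$host" "$port" 2>/dev/null; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timeout waiting for $host:$port" >&2
      return 1
    fi
    echo "Waiting for $host:$port..."
    sleep 2
    elapsed=$((elapsed + 2))
  done
}
```

Exposing the timeout as a Helm value (e.g. a hypothetical `.Values.initWaitTimeoutSeconds`) would make it tunable per environment.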
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/templates/gateway-deployment.yaml` around lines 22 -
42, The init containers wait-for-postgresql and wait-for-redis can hang forever;
modify their command scripts to implement a timeout or max-retry and exit
non-zero when exceeded so the pod fails fast. Update the command blocks for the
initContainers (names: wait-for-postgresql, wait-for-redis) to track either a
retry counter or elapsed time (e.g., MAX_RETRIES or TIMEOUT_SECONDS), loop until
nc -z succeeds or the limit is reached, print a clear error message and exit 1
on timeout; optionally expose the timeout/retry via a Helm value so it’s
configurable.

43-77: Consider adding securityContext for defense-in-depth.

Trivy flags missing securityContext.readOnlyRootFilesystem (KSV-0014) and default security context (KSV-0118). While not strictly required, adding a restrictive security context improves the security posture.

🛡️ Suggested securityContext addition
      containers:
        - name: gateway
          image: {{ include "llmgateway.image" (dict "context" . "image" .Values.gateway.image) }}
          imagePullPolicy: {{ .Values.gateway.image.pullPolicy | default "IfNotPresent" }}
+         securityContext:
+           readOnlyRootFilesystem: true
+           runAsNonRoot: true
+           allowPrivilegeEscalation: false
          ports:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/templates/gateway-deployment.yaml` around lines 43 -
77, Add a restrictive security context to the gateway container and a default
pod securityContext for defense-in-depth: under the containers block for the
container named "gateway" add a securityContext with readOnlyRootFilesystem:
true, runAsNonRoot: true, runAsUser (e.g. 1000), allowPrivilegeEscalation:
false, capabilities: drop: ["ALL"], and seccompProfile: { type: "RuntimeDefault"
}; additionally add a podSecurityContext on the Pod template spec
(spec.template.spec) to set fsGroup (e.g. 1000) and runAsNonRoot:true to provide
a default baseline for all containers. Ensure you add these keys adjacent to
existing fields (e.g., alongside ports/env/readinessProbe) so Helm templating
(templates/gateway-deployment.yaml) and .Values overrides remain intact.
infra/helm/llmgateway/templates/postgresql-statefulset.yaml (1)

21-23: Image construction differs from other deployments.

The PostgreSQL container image is constructed directly ({{ .Values.postgresql.image.repository }}:{{ .Values.postgresql.image.tag }}) rather than using the llmgateway.image helper used by other deployments. This is acceptable since PostgreSQL uses a different image structure (no registry prefix needed for official images), but worth noting for consistency.
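If consistency is preferred, a dedicated helper along the lines suggested might look like this (the helper name and values paths are illustrative):

```yaml
{{/* Hypothetical helper in templates/_helpers.tpl mirroring llmgateway.image
     for the PostgreSQL image, which needs no registry prefix. */}}
{{- define "llmgateway.postgresqlImage" -}}
{{- printf "%s:%s" .Values.postgresql.image.repository (.Values.postgresql.image.tag | toString) -}}
{{- end }}
```

The StatefulSet would then use `image: {{ include "llmgateway.postgresqlImage" . }}` instead of inline concatenation.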

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/templates/postgresql-statefulset.yaml` around lines 21
- 23, The PostgreSQL container image is being built inline using {{
.Values.postgresql.image.repository }}:{{ .Values.postgresql.image.tag }} which
diverges from other deployments that use the llmgateway image helper; update the
postgresql container spec to use the shared helper (e.g., call the
"llmgateway.image" template with the postgresql image values) or add a dedicated
helper (like "llmgateway.postgresqlImage") and use that in the StatefulSet so
image construction is consistent with other templates (change the image
attribute referenced in the postgresql container to use the helper instead of
direct concatenation).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/docs/content/self-host.mdx`:
- Around line 187-195: Update the prerequisites section to include
Kubernetes/Helm requirements so the new "Helm Chart" instructions are
executable: state that a Kubernetes cluster (minimum version, e.g., v1.20+ or
your supported minimum), kubectl configured to the target cluster, Helm 3
installed, a default StorageClass or PVC capability, access to a container
registry (or imagePullSecrets) for the llmgateway image, and any required
RBAC/load‑balancer considerations; reference the "Helm Chart" heading in the
docs to ensure users see these prerequisites before the helm repo/add/install
commands.

In `@infra/helm/llmgateway/templates/api-deployment.yaml`:
- Around line 29-42: The initContainer loops that wait with nc for PostgreSQL
and Redis can hang indefinitely; modify the shell in both wait loops (the nc
loops that reference include "llmgateway.postgresql.host"/port and the
initContainer named wait-for-redis that references include
"llmgateway.redis.host"/port) to implement bounded retries or a timeout: add a
counter (e.g., max_retries=30), increment on each try, sleep between attempts,
and if the counter exceeds max_retries exit with non-zero (exit 1) so Kubernetes
can surface the failure and trigger restart policy; alternatively use a timeout
wrapper (timeout command) around nc and exit non-zero on timeout—apply the same
change to both the PostgreSQL and Redis wait blocks.
- Around line 22-77: Add Pod- and container-level security contexts: set a
podSecurityContext (e.g., runAsNonRoot: true, runAsUser: 1000, runAsGroup: 1000,
fsGroup: 1000) and add securityContext blocks for the initContainers
"wait-for-postgresql" and "wait-for-redis" and the main container "api" with
hardened settings (readOnlyRootFilesystem: true, allowPrivilegeEscalation:
false, capabilities: drop all, seccompProfile: runtime/default, runAsNonRoot:
true). Apply these fields to the Pod spec and to each container's
securityContext so both init and main containers inherit the hardened runtime
constraints.

In `@infra/helm/llmgateway/templates/code-deployment.yaml`:
- Around line 20-54: Add a hardened securityContext to the code container and
pod: under the Deployment spec add a pod-level securityContext and inside the
container block for the container named "code" (the block that contains image:
{{ include "llmgateway.image" (dict "context" . "image" .Values.code.image) }}
and readinessProbe/livenessProbe) add a container securityContext with
runAsNonRoot: true, runAsUser: 1000, seccompProfile type RuntimeDefault,
allowPrivilegeEscalation: false, and capabilities.drop: [ALL]; mirror the same
pod and container securityContext changes for the other deployments that use
containers named "api", "gateway", "ui", "playground", "docs", "admin", and
"worker" so .Values.*.image blocks receive the same baseline hardening.

In `@infra/helm/llmgateway/templates/configmap.yaml`:
- Around line 69-71: The ConfigMap currently defines KEEP_ALIVE_TIMEOUT_S twice
(under gateway.config and api.config) causing the latter to overwrite the
former; change the keys to component-specific names (e.g. replace
KEEP_ALIVE_TIMEOUT_S in the gateway block with GATEWAY_KEEP_ALIVE_TIMEOUT_S and
the one in the api block with API_KEEP_ALIVE_TIMEOUT_S) and update the
corresponding deployment/manifest templates that read these values so
gateway.config consumers read GATEWAY_KEEP_ALIVE_TIMEOUT_S and api.config
consumers read API_KEEP_ALIVE_TIMEOUT_S (ensure any references in templates,
envFrom/env entries or valueFrom lookups are updated to the new key names).
- Around line 118-120: The ConfigMap currently uses truthiness checks (e.g. '{{-
if .enableDataRetentionCleanup }}') which omit env entries when users
intentionally set false/""; change those guards to key-presence checks using
hasKey against .Values (e.g. replace the if with '{{- if hasKey .Values
"enableDataRetentionCleanup" }}' and render ENABLE_DATA_RETENTION_CLEANUP with
the quoted value), and apply the same pattern to the other boolean/numeric
guards in this template so the env var is emitted whenever the key exists
regardless of its falsy value.
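The hasKey guard described above would look roughly like this; the value name comes from the comment, while indentation and quoting are assumed:

```yaml
{{- if hasKey .Values "enableDataRetentionCleanup" }}
  # emitted whenever the key exists, even when explicitly set to false
  ENABLE_DATA_RETENTION_CLEANUP: {{ .Values.enableDataRetentionCleanup | quote }}
{{- end }}
```

Unlike a plain truthiness `if`, this still renders the env var when the user explicitly sets the value to false.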

In `@infra/helm/llmgateway/templates/ingress.yaml`:
- Around line 26-110: The template unconditionally emits the Ingress "rules:"
key which can produce an empty rules array when none of the components are
enabled; update the template to only render "rules:" when at least one of the
component conditions is true (check the same conditions used for
ui/api/gateway/playground/code/docs/admin: .Values.ui.enabled and
.Values.ingress.hosts.ui, .Values.api.enabled and .Values.ingress.hosts.api,
etc.), or alternatively provide a safe fallback by rendering a defaultBackend
when none of those conditions are met; locate the block around the "rules:" line
and wrap it with a single conditional that ORs all seven checks (or add a
defaultBackend section under spec) so Kubernetes never receives an empty rules
array.
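One way to sketch that guard is to accumulate a flag over the component keys before emitting `rules:` (keys and value paths taken from the comment; the rule bodies themselves stay as they are):

```yaml
{{- $any := false }}
{{- range $key := list "ui" "api" "gateway" "playground" "code" "docs" "admin" }}
{{- if and (index $.Values $key).enabled (index $.Values.ingress.hosts $key) }}
{{- $any = true }}
{{- end }}
{{- end }}
{{- if $any }}
  rules:
    # existing per-component rule blocks go here, unchanged
{{- end }}
```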

In `@infra/helm/llmgateway/templates/redis-statefulset.yaml`:
- Around line 18-65: Add a Pod and container security hardening baseline to the
Redis StatefulSet: in the StatefulSet spec add podSecurityContext with
runAsUser: 999, runAsGroup: 999 and fsGroup: 999 and add an emptyDir volume
mounted at /tmp; then on the container named "redis" set securityContext to
disallow privilege escalation (allowPrivilegeEscalation: false), drop all
capabilities, set readOnlyRootFilesystem: true, and set seccompProfile to
runtime/default so the readOnlyRootFilesystem works; ensure the container still
mounts the /tmp emptyDir and that the names (podSecurityContext,
securityContext, container "redis") match existing fields in the template.

In `@infra/helm/llmgateway/templates/secret.yaml`:
- Around line 15-20: The DATABASE_URL construction uses raw passwords and can
break with special chars; update the template to url-encode the password using
the Helm/Go template urlquery function for both .Values.postgresql.password and
.Values.externalPostgresql.password when building DATABASE_URL (refer to the
DATABASE_URL line and $pgUser/$pgHost/$pgPort/$pgDB symbols), and ensure
POSTGRES_PASSWORD is set consistently for runtime consumers (move or add
POSTGRES_PASSWORD outside the if/else or set it from the externalPostgresql
branch as well so containers always receive POSTGRES_PASSWORD).
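A sketch of the url-encoding fix, using the `$pg*` symbols named in the comment and assuming they are already defined as strings earlier in the template (Secret data fields are base64-encoded, hence `b64enc`):

```yaml
{{- $pgPass := .Values.postgresql.password | urlquery }}
DATABASE_URL: {{ printf "postgresql://%s:%s@%s:%s/%s" $pgUser $pgPass $pgHost $pgPort $pgDB | b64enc | quote }}
# kept outside the internal/external branch so containers always receive it
POSTGRES_PASSWORD: {{ .Values.postgresql.password | b64enc | quote }}
```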

In `@infra/helm/llmgateway/templates/ui-deployment.yaml`:
- Line 9: The replicas line treats 0 as empty because default considers zero
value empty; change the template to check whether the replicas key is present
and only fall back to 1 when it's absent. Replace the single-line "replicas: {{
.Values.ui.replicas | default 1 }}" with a conditional using hasKey on
.Values.ui (e.g., if hasKey .Values.ui "replicas" then render "replicas: {{
.Values.ui.replicas }}" else render "replicas: 1" end) so an explicit
ui.replicas: 0 is preserved.
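A minimal sketch of that change (indentation assumed to match the surrounding Deployment spec):

```yaml
  {{- if hasKey .Values.ui "replicas" }}
  replicas: {{ .Values.ui.replicas }}
  {{- else }}
  replicas: 1
  {{- end }}
```

Because `default` treats the zero value as empty, only a presence check preserves an explicit `ui.replicas: 0`.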

In `@infra/helm/llmgateway/values.yaml`:
- Around line 43-47: The values.yaml currently contains placeholder secrets
(auth.authSecret and auth.gatewayApiKeyHashSecret) that must not be used in
production; update Helm templates (e.g., secret.yaml or a new validation
template) to validate these keys at render time by checking
.Values.auth.authSecret and .Values.auth.gatewayApiKeyHashSecret and failing the
install if they still equal the default placeholders (or use the required
function to enforce non-empty, plus explicit equality checks to fail when the
exact default strings are present), emitting clear messages like
"auth.authSecret must be changed from the default value" and
"auth.gatewayApiKeyHashSecret must be changed from the default value".
- Around line 292-299: The default insecure password is set at
postgresql.password ("changeme"); change this to an empty value or require a
user-supplied secret and add validation to ensure postgresql.password is not the
default. Update the helm values template to treat postgresql.password as
optional (empty by default) and/or add the same validation logic used for auth
secrets to fail deployment if postgresql.password == "changeme" or left
unchanged, referencing the postgresql block (postgresql.enabled,
postgresql.user, postgresql.password) so the chart forces explicit credential
configuration.
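A render-time guard of the kind described could be sketched with Helm's `required` and `fail` functions; the placeholder string shown here is illustrative, since the chart's actual default is not reproduced in this review:

```yaml
{{- $auth := required "auth.authSecret is required" .Values.auth.authSecret }}
{{- if eq $auth "change-me-placeholder" }}
{{- fail "auth.authSecret must be changed from the default value" }}
{{- end }}
```

The same pattern would apply to `auth.gatewayApiKeyHashSecret` and `postgresql.password`.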

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 8bc06efd-b953-4858-8ade-ce75457edce6

📥 Commits

Reviewing files that changed from the base of the PR and between 28d9ada and db4e2a5.

📒 Files selected for processing (27)
  • apps/docs/content/self-host.mdx
  • infra/helm/README.md
  • infra/helm/llmgateway/Chart.yaml
  • infra/helm/llmgateway/templates/_helpers.tpl
  • infra/helm/llmgateway/templates/admin-deployment.yaml
  • infra/helm/llmgateway/templates/admin-service.yaml
  • infra/helm/llmgateway/templates/api-deployment.yaml
  • infra/helm/llmgateway/templates/api-service.yaml
  • infra/helm/llmgateway/templates/code-deployment.yaml
  • infra/helm/llmgateway/templates/code-service.yaml
  • infra/helm/llmgateway/templates/configmap.yaml
  • infra/helm/llmgateway/templates/docs-deployment.yaml
  • infra/helm/llmgateway/templates/docs-service.yaml
  • infra/helm/llmgateway/templates/gateway-deployment.yaml
  • infra/helm/llmgateway/templates/gateway-service.yaml
  • infra/helm/llmgateway/templates/ingress.yaml
  • infra/helm/llmgateway/templates/playground-deployment.yaml
  • infra/helm/llmgateway/templates/playground-service.yaml
  • infra/helm/llmgateway/templates/postgresql-service.yaml
  • infra/helm/llmgateway/templates/postgresql-statefulset.yaml
  • infra/helm/llmgateway/templates/redis-service.yaml
  • infra/helm/llmgateway/templates/redis-statefulset.yaml
  • infra/helm/llmgateway/templates/secret.yaml
  • infra/helm/llmgateway/templates/ui-deployment.yaml
  • infra/helm/llmgateway/templates/ui-service.yaml
  • infra/helm/llmgateway/templates/worker-deployment.yaml
  • infra/helm/llmgateway/values.yaml

Comment on lines +187 to +195
## Helm Chart

You can also deploy LLMGateway using the Helm chart.

```bash
helm repo add llmgateway https://llmgateway.io/helm
helm repo update
helm install llmgateway llmgateway/llmgateway
```

⚠️ Potential issue | 🟠 Major

Add Helm-specific prerequisites to avoid failed onboarding.

Line [187] adds a Helm deployment path, but prerequisites are still Docker-only. Please add Kubernetes cluster + Helm requirements in the prerequisites section so this option is executable end-to-end.

📘 Proposed doc patch
 ## Prerequisites
 
 - Latest Docker
+- A Kubernetes cluster (local or managed)
+- Helm v3
 - API keys for the LLM providers you want to use (OpenAI, Anthropic, etc.)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/docs/content/self-host.mdx` around lines 187 - 195, Update the
prerequisites section to include Kubernetes/Helm requirements so the new "Helm
Chart" instructions are executable: state that a Kubernetes cluster (minimum
version, e.g., v1.20+ or your supported minimum), kubectl configured to the
target cluster, Helm 3 installed, a default StorageClass or PVC capability,
access to a container registry (or imagePullSecrets) for the llmgateway image,
and any required RBAC/load‑balancer considerations; reference the "Helm Chart"
heading in the docs to ensure users see these prerequisites before the helm
repo/add/install commands.

Comment on lines +22 to +77
```yaml
      initContainers:
        - name: wait-for-postgresql
          image: busybox:1.37
          command:
            - sh
            - -c
            - |
              until nc -z {{ include "llmgateway.postgresql.host" . }} {{ include "llmgateway.postgresql.port" . }}; do
                echo "Waiting for PostgreSQL..."
                sleep 2
              done
        - name: wait-for-redis
          image: busybox:1.37
          command:
            - sh
            - -c
            - |
              until nc -z {{ include "llmgateway.redis.host" . }} {{ include "llmgateway.redis.port" . }}; do
                echo "Waiting for Redis..."
                sleep 2
              done
      containers:
        - name: api
          image: {{ include "llmgateway.image" (dict "context" . "image" .Values.api.image) }}
          imagePullPolicy: {{ .Values.api.image.pullPolicy | default "IfNotPresent" }}
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
          envFrom:
            - configMapRef:
                name: {{ include "llmgateway.configMapName" . }}
            - secretRef:
                name: {{ include "llmgateway.secretName" . }}
          {{- with .Values.api.extraEnv }}
          env:
            {{- toYaml . | nindent 12 }}
          {{- end }}
          readinessProbe:
            httpGet:
              path: /
              port: http
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /
              port: http
            initialDelaySeconds: 30
            periodSeconds: 30
            timeoutSeconds: 10
            failureThreshold: 3
          resources:
            {{- toYaml .Values.api.resources | nindent 12 }}
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "api-deployment.yaml" -type f

Repository: theopenco/llmgateway

Length of output: 117


🏁 Script executed:

find . -name "values.yaml" -path "*/llmgateway*" -type f

Repository: theopenco/llmgateway

Length of output: 99


🏁 Script executed:

cat -n ./infra/helm/llmgateway/templates/api-deployment.yaml

Repository: theopenco/llmgateway

Length of output: 3634


🏁 Script executed:

cat -n ./infra/helm/llmgateway/values.yaml

Repository: theopenco/llmgateway

Length of output: 14853


🏁 Script executed:

find ./infra/helm/llmgateway/templates -name "*deployment.yaml" -type f

Repository: theopenco/llmgateway

Length of output: 512


🏁 Script executed:

rg "securityContext" ./infra/helm/llmgateway/templates/ || echo "No securityContext found in templates"

Repository: theopenco/llmgateway

Length of output: 101


Add pod and container security contexts to harden the deployment.

The api-deployment.yaml (lines 20-77) runs containers with unrestricted security defaults. Both init and main containers lack security context settings, allowing unnecessary privilege surface.

Add the following hardening to reduce attack surface:

🔒 Security hardening changes
     spec:
+      securityContext:
+        runAsNonRoot: true
+        seccompProfile:
+          type: RuntimeDefault
       {{- include "llmgateway.imagePullSecrets" . | nindent 6 }}
       initContainers:
         - name: wait-for-postgresql
+          securityContext:
+            allowPrivilegeEscalation: false
+            readOnlyRootFilesystem: true
+            capabilities:
+              drop: ["ALL"]
           image: busybox:1.37
@@
         - name: wait-for-redis
+          securityContext:
+            allowPrivilegeEscalation: false
+            readOnlyRootFilesystem: true
+            capabilities:
+              drop: ["ALL"]
           image: busybox:1.37
@@
         - name: api
+          securityContext:
+            allowPrivilegeEscalation: false
+            readOnlyRootFilesystem: true
+            capabilities:
+              drop: ["ALL"]
           image: {{ include "llmgateway.image" (dict "context" . "image" .Values.api.image) }}
🧰 Tools
🪛 Trivy (0.69.3)

[error] 32-37: Root file system is not read-only

Container 'wait-for-postgresql' of Deployment 'llmgateway-api' should set 'securityContext.readOnlyRootFilesystem' to true

Rule: KSV-0014

Learn more

(IaC/Kubernetes)


[error] 42-47: Root file system is not read-only

Container 'wait-for-redis' of Deployment 'llmgateway-api' should set 'securityContext.readOnlyRootFilesystem' to true

Rule: KSV-0014

Learn more

(IaC/Kubernetes)


[error] 32-37: Default security context configured

container llmgateway-api in default namespace is using the default security context

Rule: KSV-0118

Learn more

(IaC/Kubernetes)


[error] 42-47: Default security context configured

container llmgateway-api in default namespace is using the default security context

Rule: KSV-0118

Learn more

(IaC/Kubernetes)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/templates/api-deployment.yaml` around lines 22 - 77,
Add Pod- and container-level security contexts: set a podSecurityContext (e.g.,
runAsNonRoot: true, runAsUser: 1000, runAsGroup: 1000, fsGroup: 1000) and add
securityContext blocks for the initContainers "wait-for-postgresql" and
"wait-for-redis" and the main container "api" with hardened settings
(readOnlyRootFilesystem: true, allowPrivilegeEscalation: false, capabilities:
drop all, seccompProfile: runtime/default, runAsNonRoot: true). Apply these
fields to the Pod spec and to each container's securityContext so both init and
main containers inherit the hardened runtime constraints.

Comment on lines +29 to +42
until nc -z {{ include "llmgateway.postgresql.host" . }} {{ include "llmgateway.postgresql.port" . }}; do
echo "Waiting for PostgreSQL..."
sleep 2
done
- name: wait-for-redis
image: busybox:1.37
command:
- sh
- -c
- |
until nc -z {{ include "llmgateway.redis.host" . }} {{ include "llmgateway.redis.port" . }}; do
echo "Waiting for Redis..."
sleep 2
done
Contributor

⚠️ Potential issue | 🟠 Major

Avoid infinite dependency wait loops in initContainers.

Lines 29-42 can block forever if PostgreSQL/Redis are unreachable, leaving pods stuck in Init without a terminal signal. Add bounded retries (or timeout + non-zero exit) so failures surface and recover via restart policy.
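
A bounded variant of the PostgreSQL wait loop might look like this (a sketch only — the retry cap of 30 is an assumed value, and the same change applies to the Redis block):

```yaml
initContainers:
  - name: wait-for-postgresql
    image: busybox:1.37
    command:
      - sh
      - -c
      - |
        i=0
        until nc -z {{ include "llmgateway.postgresql.host" . }} {{ include "llmgateway.postgresql.port" . }}; do
          i=$((i + 1))
          # Fail the init container after ~60s so the pod's restart policy surfaces the problem
          if [ "$i" -ge 30 ]; then
            echo "PostgreSQL not reachable after $i attempts" >&2
            exit 1
          fi
          echo "Waiting for PostgreSQL..."
          sleep 2
        done
```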

🧰 Tools
🪛 Trivy (0.69.3)

[error] 32-37: Root file system is not read-only

Container 'wait-for-postgresql' of Deployment 'llmgateway-api' should set 'securityContext.readOnlyRootFilesystem' to true

Rule: KSV-0014

Learn more

(IaC/Kubernetes)


[error] 32-37: Default security context configured

container llmgateway-api in default namespace is using the default security context

Rule: KSV-0118

Learn more

(IaC/Kubernetes)


[error] 29-87: Default security context configured

deployment llmgateway-api in default namespace is using the default security context, which allows root privileges

Rule: KSV-0118

Learn more

(IaC/Kubernetes)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/templates/api-deployment.yaml` around lines 29 - 42,
The initContainer loops that wait with nc for PostgreSQL and Redis can hang
indefinitely; modify the shell in both wait loops (the nc loops that reference
include "llmgateway.postgresql.host"/port and the initContainer named
wait-for-redis that references include "llmgateway.redis.host"/port) to
implement bounded retries or a timeout: add a counter (e.g., max_retries=30),
increment on each try, sleep between attempts, and if the counter exceeds
max_retries exit with non-zero (exit 1) so Kubernetes can surface the failure
and trigger restart policy; alternatively use a timeout wrapper (timeout
command) around nc and exit non-zero on timeout—apply the same change to both
the PostgreSQL and Redis wait blocks.

Comment on lines +20 to +54
spec:
{{- include "llmgateway.imagePullSecrets" . | nindent 6 }}
containers:
- name: code
image: {{ include "llmgateway.image" (dict "context" . "image" .Values.code.image) }}
imagePullPolicy: {{ .Values.code.image.pullPolicy | default "IfNotPresent" }}
ports:
- name: http
containerPort: 80
protocol: TCP
envFrom:
- configMapRef:
name: {{ include "llmgateway.configMapName" . }}
- secretRef:
name: {{ include "llmgateway.secretName" . }}
{{- with .Values.code.extraEnv }}
env:
{{- toYaml . | nindent 12 }}
{{- end }}
readinessProbe:
httpGet:
path: /api/health
port: http
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 5
livenessProbe:
httpGet:
path: /api/health
port: http
initialDelaySeconds: 20
periodSeconds: 30
timeoutSeconds: 5
resources:
{{- toYaml .Values.code.resources | nindent 12 }}
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the deployment file
find . -name "code-deployment.yaml" -type f

Repository: theopenco/llmgateway

Length of output: 118


🏁 Script executed:

#!/bin/bash
# Read the full deployment file to see the complete context
if [ -f "infra/helm/llmgateway/templates/code-deployment.yaml" ]; then
  cat -n "infra/helm/llmgateway/templates/code-deployment.yaml"
fi

Repository: theopenco/llmgateway

Length of output: 2745


🏁 Script executed:

#!/bin/bash
# Check the values file to see if security contexts are defined there
find infra/helm/llmgateway -name "values*.yaml" -type f -exec cat -n {} +

Repository: theopenco/llmgateway

Length of output: 14853


🏁 Script executed:

#!/bin/bash
# Check other deployment templates for security context examples
find infra/helm/llmgateway/templates -name "*deployment.yaml" -type f | while read file; do
  echo "=== File: $file ==="
  grep -n "securityContext\|runAsNonRoot\|seccomp\|capabilities" "$file"
  if [ $? -ne 0 ]; then
    echo "No security context found"
  fi
done

Repository: theopenco/llmgateway

Length of output: 816


🏁 Script executed:

#!/bin/bash
# Check if there are any helper templates that define security contexts
find infra/helm/llmgateway/templates -type f -name "*" | head -20

Repository: theopenco/llmgateway

Length of output: 1102


🏁 Script executed:

#!/bin/bash
# Check the helpers template
cat -n infra/helm/llmgateway/templates/_helpers.tpl

Repository: theopenco/llmgateway

Length of output: 5950


Add securityContext to the code container.

The code container lacks hardened security settings. Add pod and container security contexts:

securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  seccompProfile:
    type: RuntimeDefault
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL

Also apply these settings to other deployments (api, gateway, ui, playground, docs, admin, worker) for consistent baseline security hardening.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/templates/code-deployment.yaml` around lines 20 - 54,
Add a hardened securityContext to the code container and pod: under the
Deployment spec add a pod-level securityContext and inside the container block
for the container named "code" (the block that contains image: {{ include
"llmgateway.image" (dict "context" . "image" .Values.code.image) }} and
readinessProbe/livenessProbe) add a container securityContext with runAsNonRoot:
true, runAsUser: 1000, seccompProfile type RuntimeDefault,
allowPrivilegeEscalation: false, and capabilities.drop: [ALL]; mirror the same
pod and container securityContext changes for the other deployments that use
containers named "api", "gateway", "ui", "playground", "docs", "admin", and
"worker" so .Values.*.image blocks receive the same baseline hardening.

Comment on lines +69 to +71
{{- if .keepAliveTimeoutS }}
KEEP_ALIVE_TIMEOUT_S: {{ .keepAliveTimeoutS | quote }}
{{- end }}
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the configmap.yaml file
fd -type f -name "configmap.yaml" | grep -i llmgateway

Repository: theopenco/llmgateway

Length of output: 234


🏁 Script executed:

# Read the entire configmap.yaml to understand structure
cat -n infra/helm/llmgateway/templates/configmap.yaml

Repository: theopenco/llmgateway

Length of output: 6290


KEEP_ALIVE_TIMEOUT_S is defined twice in the same ConfigMap data map.

Lines 70 and 90 both render this key under different contexts (gateway.config and api.config). When both conditions are true, the second occurrence (line 90) overwrites the first, breaking per-component timeout tuning.

Rename to component-specific keys (GATEWAY_KEEP_ALIVE_TIMEOUT_S and API_KEEP_ALIVE_TIMEOUT_S) and wire each component deployment to its corresponding key.
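
Under that renaming, the gateway half of the change might look like this (sketch; it assumes `KEEP_ALIVE_TIMEOUT_S` is the env var name the application itself reads, so the deployment maps the component-specific ConfigMap key back onto it):

```yaml
# configmap.yaml (gateway block, sketch)
{{- if .keepAliveTimeoutS }}
GATEWAY_KEEP_ALIVE_TIMEOUT_S: {{ .keepAliveTimeoutS | quote }}
{{- end }}

# gateway-deployment.yaml (sketch): map the renamed key back to the app's env var
env:
  - name: KEEP_ALIVE_TIMEOUT_S
    valueFrom:
      configMapKeyRef:
        name: {{ include "llmgateway.configMapName" . }}
        key: GATEWAY_KEEP_ALIVE_TIMEOUT_S
        optional: true
```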

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/templates/configmap.yaml` around lines 69 - 71, The
ConfigMap currently defines KEEP_ALIVE_TIMEOUT_S twice (under gateway.config and
api.config) causing the latter to overwrite the former; change the keys to
component-specific names (e.g. replace KEEP_ALIVE_TIMEOUT_S in the gateway block
with GATEWAY_KEEP_ALIVE_TIMEOUT_S and the one in the api block with
API_KEEP_ALIVE_TIMEOUT_S) and update the corresponding deployment/manifest
templates that read these values so gateway.config consumers read
GATEWAY_KEEP_ALIVE_TIMEOUT_S and api.config consumers read
API_KEEP_ALIVE_TIMEOUT_S (ensure any references in templates, envFrom/env
entries or valueFrom lookups are updated to the new key names).

Comment on lines +18 to +65
spec:
{{- include "llmgateway.imagePullSecrets" . | nindent 6 }}
containers:
- name: redis
image: {{ .Values.redis.image.repository }}:{{ .Values.redis.image.tag }}
imagePullPolicy: {{ .Values.redis.image.pullPolicy | default "IfNotPresent" }}
ports:
- name: redis
containerPort: 6379
protocol: TCP
{{- if .Values.redis.password }}
command:
- redis-server
- --requirepass
- $(REDIS_PASSWORD)
env:
- name: REDIS_PASSWORD
valueFrom:
secretKeyRef:
name: {{ include "llmgateway.secretName" . }}
key: REDIS_PASSWORD
{{- end }}
readinessProbe:
exec:
command:
- redis-cli
{{- if .Values.redis.password }}
- -a
- $(REDIS_PASSWORD)
{{- end }}
- ping
initialDelaySeconds: 5
periodSeconds: 10
timeoutSeconds: 5
livenessProbe:
exec:
command:
- redis-cli
{{- if .Values.redis.password }}
- -a
- $(REDIS_PASSWORD)
{{- end }}
- ping
initialDelaySeconds: 15
periodSeconds: 30
timeoutSeconds: 5
resources:
{{- toYaml .Values.redis.resources | nindent 12 }}
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /tmp && find . -type f -name "redis-statefulset.yaml" 2>/dev/null | head -20

Repository: theopenco/llmgateway

Length of output: 46


🏁 Script executed:

find . -type f -name "*redis*statefulset*" -o -name "*redis-statefulset*" 2>/dev/null

Repository: theopenco/llmgateway

Length of output: 120


🏁 Script executed:

fd -i redis infra/helm 2>/dev/null

Repository: theopenco/llmgateway

Length of output: 169


🏁 Script executed:

ls -la infra/helm/llmgateway/templates/ 2>/dev/null | head -20

Repository: theopenco/llmgateway

Length of output: 1354


🏁 Script executed:

cat infra/helm/llmgateway/templates/redis-statefulset.yaml

Repository: theopenco/llmgateway

Length of output: 2695


🏁 Script executed:

grep -r "securityContext" infra/helm/llmgateway/templates/ --include="*.yaml"

Repository: theopenco/llmgateway

Length of output: 46


🏁 Script executed:

cat infra/helm/llmgateway/templates/api-deployment.yaml

Repository: theopenco/llmgateway

Length of output: 3032


🏁 Script executed:

cat infra/helm/llmgateway/templates/gateway-deployment.yaml

Repository: theopenco/llmgateway

Length of output: 3125


🏁 Script executed:

cat infra/helm/llmgateway/values.yaml | head -100

Repository: theopenco/llmgateway

Length of output: 2925


🏁 Script executed:

rg -A 10 "redis:" infra/helm/llmgateway/values.yaml

Repository: theopenco/llmgateway

Length of output: 281


🏁 Script executed:

cat infra/helm/llmgateway/templates/postgresql-statefulset.yaml

Repository: theopenco/llmgateway

Length of output: 3032


🌐 Web query:

Redis readOnlyRootFilesystem Kubernetes security context

💡 Result:

To run Redis on Kubernetes with readOnlyRootFilesystem: true, you must (a) keep the image root FS read-only and (b) provide explicit writable mounts for every path Redis (or the OS) needs to write to—typically /data and often /tmp.

Why it breaks without extra mounts

  • The official Redis image expects persistence files under /data (RDB/AOF) and commonly runs with /data as the writable location you mount a volume to. [1]
  • With a read-only root filesystem, writes to /tmp can also fail (many images/tools assume /tmp is writable), leading to CrashLoopBackOff unless you mount a writable volume there. [2]

Example (StatefulSet/Deployment container spec)

spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 999
    runAsGroup: 999
    fsGroup: 999   # helps make mounted volumes writable for the process
  containers:
  - name: redis
    image: redis:7
    securityContext:
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
    volumeMounts:
    - name: redis-data
      mountPath: /data
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}
  # For StatefulSet, use volumeClaimTemplates for redis-data; for Deployment, use a PVC volume.

Notes

  • fsGroup is commonly used so Kubernetes mounts can be writable by a non-root process (Kubernetes adjusts volume ownership/permissions to match the fsGroup when mounting). [3]
  • If you also set mounted volumes readOnly: true, be aware of historical Linux/Kubernetes edge-cases around “truly” recursive read-only mounts (mostly relevant when there are submounts). [4]

Sources: Redis official image docs [1], K8s /tmp workaround example [2], fsGroup behavior explanation [3], Kubernetes recursive read-only mount details [4].


Add pod/container security contexts for Redis.

Lines 18-65 currently rely on default security settings. For a database/cache workload, this should be hardened by default (non-root, no privilege escalation, dropped capabilities, runtime default seccomp).

The suggested hardening requires additional configuration for readOnlyRootFilesystem: true to work properly:

🔒 Suggested hardening baseline
     spec:
+      securityContext:
+        runAsNonRoot: true
+        runAsUser: 999
+        runAsGroup: 999
+        fsGroup: 999
+        seccompProfile:
+          type: RuntimeDefault
       {{- include "llmgateway.imagePullSecrets" . | nindent 6 }}
       containers:
         - name: redis
+          securityContext:
+            allowPrivilegeEscalation: false
+            readOnlyRootFilesystem: true
+            capabilities:
+              drop: ["ALL"]
           image: {{ .Values.redis.image.repository }}:{{ .Values.redis.image.tag }}
           imagePullPolicy: {{ .Values.redis.image.pullPolicy | default "IfNotPresent" }}
           ports:
             - name: redis
               containerPort: 6379
               protocol: TCP
           {{- if .Values.redis.password }}
           command:
             - redis-server
             - --requirepass
             - $(REDIS_PASSWORD)
           env:
             - name: REDIS_PASSWORD
               valueFrom:
                 secretKeyRef:
                   name: {{ include "llmgateway.secretName" . }}
                   key: REDIS_PASSWORD
           {{- end }}
           readinessProbe:
             exec:
               command:
                 - redis-cli
                 {{- if .Values.redis.password }}
                 - -a
                 - $(REDIS_PASSWORD)
                 {{- end }}
                 - ping
             initialDelaySeconds: 5
             periodSeconds: 10
             timeoutSeconds: 5
           livenessProbe:
             exec:
               command:
                 - redis-cli
                 {{- if .Values.redis.password }}
                 - -a
                 - $(REDIS_PASSWORD)
                 {{- end }}
                 - ping
             initialDelaySeconds: 15
             periodSeconds: 30
             timeoutSeconds: 5
           resources:
             {{- toYaml .Values.redis.resources | nindent 12 }}
           volumeMounts:
             - name: data
               mountPath: /data
+            - name: tmp
+              mountPath: /tmp
       volumes:
+        - name: tmp
+          emptyDir: {}

Note: fsGroup: 999 and /tmp emptyDir are required for the read-only filesystem to work; the runAsUser and runAsGroup should match the Redis image's unprivileged user UID/GID (typically 999 for redis:8-alpine).

🧰 Tools
🪛 Trivy (0.69.3)

[error] 30-62: Root file system is not read-only

Container 'redis' of StatefulSet 'llmgateway-redis' should set 'securityContext.readOnlyRootFilesystem' to true

Rule: KSV-0014

Learn more

(IaC/Kubernetes)


[error] 30-62: Default security context configured

container llmgateway-redis in default namespace is using the default security context

Rule: KSV-0118

Learn more

(IaC/Kubernetes)


[error] 27-62: Default security context configured

statefulset llmgateway-redis in default namespace is using the default security context, which allows root privileges

Rule: KSV-0118

Learn more

(IaC/Kubernetes)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/templates/redis-statefulset.yaml` around lines 18 - 65,
Add a Pod and container security hardening baseline to the Redis StatefulSet: in
the StatefulSet spec add podSecurityContext with runAsUser: 999, runAsGroup: 999
and fsGroup: 999 and add an emptyDir volume mounted at /tmp; then on the
container named "redis" set securityContext to disallow privilege escalation
(allowPrivilegeEscalation: false), drop all capabilities, set
readOnlyRootFilesystem: true, and set seccompProfile to runtime/default so the
readOnlyRootFilesystem works; ensure the container still mounts the /tmp
emptyDir and that the names (podSecurityContext, securityContext, container
"redis") match existing fields in the template.

Comment on lines +15 to +20
{{- if .Values.postgresql.enabled }}
DATABASE_URL: {{ printf "postgres://%s:%s@%s:%s/%s" $pgUser .Values.postgresql.password $pgHost $pgPort $pgDB | quote }}
POSTGRES_PASSWORD: {{ .Values.postgresql.password | quote }}
{{- else }}
DATABASE_URL: {{ printf "postgres://%s:%s@%s:%s/%s" $pgUser .Values.externalPostgresql.password $pgHost $pgPort $pgDB | quote }}
{{- end }}
Contributor

⚠️ Potential issue | 🟡 Minor

Password in DATABASE_URL should be URL-encoded.

If the PostgreSQL password contains special characters (e.g., @, :, /, %), the connection string will be malformed. Use urlquery to encode the password.

Also note: POSTGRES_PASSWORD is only set when using the built-in PostgreSQL (line 17), not for external PostgreSQL. If the worker/api containers expect this env var regardless, it may cause issues.
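
To illustrate the failure mode (a standalone sketch, not chart code — the user, host, and database names are made up): a password containing `@` or `:` splits the URL's authority section unless percent-encoded, which the suggested `urlquery` function performs at render time. A minimal shell equivalent:

```shell
# Percent-encode the reserved characters mentioned above ('%' first, so the
# escape sequences themselves are not re-encoded), then build the connection URL.
pass='p@ss:word'
encoded=$(printf '%s' "$pass" | sed -e 's/%/%25/g' -e 's/@/%40/g' -e 's/:/%3A/g')
echo "postgres://user:${encoded}@db:5432/llmgateway"
# → postgres://user:p%40ss%3Aword@db:5432/llmgateway
```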

🔧 Proposed fix for URL encoding
  {{- if .Values.postgresql.enabled }}
- DATABASE_URL: {{ printf "postgres://%s:%s@%s:%s/%s" $pgUser .Values.postgresql.password $pgHost $pgPort $pgDB | quote }}
+ DATABASE_URL: {{ printf "postgres://%s:%s@%s:%s/%s" $pgUser (.Values.postgresql.password | urlquery) $pgHost $pgPort $pgDB | quote }}
  POSTGRES_PASSWORD: {{ .Values.postgresql.password | quote }}
  {{- else }}
- DATABASE_URL: {{ printf "postgres://%s:%s@%s:%s/%s" $pgUser .Values.externalPostgresql.password $pgHost $pgPort $pgDB | quote }}
+ DATABASE_URL: {{ printf "postgres://%s:%s@%s:%s/%s" $pgUser (.Values.externalPostgresql.password | urlquery) $pgHost $pgPort $pgDB | quote }}
  {{- end }}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{{- if .Values.postgresql.enabled }}
DATABASE_URL: {{ printf "postgres://%s:%s@%s:%s/%s" $pgUser .Values.postgresql.password $pgHost $pgPort $pgDB | quote }}
POSTGRES_PASSWORD: {{ .Values.postgresql.password | quote }}
{{- else }}
DATABASE_URL: {{ printf "postgres://%s:%s@%s:%s/%s" $pgUser .Values.externalPostgresql.password $pgHost $pgPort $pgDB | quote }}
{{- end }}
{{- if .Values.postgresql.enabled }}
DATABASE_URL: {{ printf "postgres://%s:%s@%s:%s/%s" $pgUser (.Values.postgresql.password | urlquery) $pgHost $pgPort $pgDB | quote }}
POSTGRES_PASSWORD: {{ .Values.postgresql.password | quote }}
{{- else }}
DATABASE_URL: {{ printf "postgres://%s:%s@%s:%s/%s" $pgUser (.Values.externalPostgresql.password | urlquery) $pgHost $pgPort $pgDB | quote }}
{{- end }}
🧰 Tools
🪛 Checkov (3.2.513)

[medium] 16-17: Basic Auth Credentials

(CKV_SECRET_4)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/templates/secret.yaml` around lines 15 - 20, The
DATABASE_URL construction uses raw passwords and can break with special chars;
update the template to url-encode the password using the Helm/Go template
urlquery function for both .Values.postgresql.password and
.Values.externalPostgresql.password when building DATABASE_URL (refer to the
DATABASE_URL line and $pgUser/$pgHost/$pgPort/$pgDB symbols), and ensure
POSTGRES_PASSWORD is set consistently for runtime consumers (move or add
POSTGRES_PASSWORD outside the if/else or set it from the externalPostgresql
branch as well so containers always receive POSTGRES_PASSWORD).
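
The `POSTGRES_PASSWORD` consistency issue noted above could be addressed by emitting the key in both branches (sketch):

```yaml
{{- if .Values.postgresql.enabled }}
POSTGRES_PASSWORD: {{ .Values.postgresql.password | quote }}
{{- else }}
POSTGRES_PASSWORD: {{ .Values.externalPostgresql.password | quote }}
{{- end }}
```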

labels:
{{- include "llmgateway.componentLabels" (dict "context" . "component" "ui") | nindent 4 }}
spec:
replicas: {{ .Values.ui.replicas | default 1 }}
Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Helm Sprig default filter zero numeric value behavior Go template

💡 Result:

In Helm templates, default comes from Sprig and returns the fallback when the given value is considered “empty”. For numerics, 0 is defined as empty, so default will replace an explicitly-set 0 with the default value. Example: {{ .Values.replicas | default 1 }} renders 1 when .Values.replicas is 0. [1]

If you need to distinguish “unset” vs “set to 0”, don’t use default on the number directly—check key existence instead:

{{- if hasKey .Values "replicas" -}}
{{ .Values.replicas }}
{{- else -}}
1
{{- end -}}

hasKey returns true when the dict contains the key. [2]

Sources: Sprig “Default Functions” docs [1], Helm template function list (hasKey) [2].


🏁 Script executed:

# Verify the file exists and check the exact content at line 9
fd -t f "ui-deployment.yaml" infra/

Repository: theopenco/llmgateway

Length of output: 114


🏁 Script executed:

cat -n infra/helm/llmgateway/templates/ui-deployment.yaml | head -20

Repository: theopenco/llmgateway

Length of output: 997


Preserve explicit replicas: 0 instead of forcing 1.

On Line 9, default treats 0 as empty, so ui.replicas: 0 still renders as 1. This prevents scale-to-zero deployments. Use hasKey to distinguish between unset and explicitly-set zero values.

Proposed fix
-  replicas: {{ .Values.ui.replicas | default 1 }}
+  {{- if hasKey .Values.ui "replicas" }}
+  replicas: {{ .Values.ui.replicas }}
+  {{- else }}
+  replicas: 1
+  {{- end }}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
replicas: {{ .Values.ui.replicas | default 1 }}
{{- if hasKey .Values.ui "replicas" }}
replicas: {{ .Values.ui.replicas }}
{{- else }}
replicas: 1
{{- end }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/templates/ui-deployment.yaml` at line 9, The replicas
line treats 0 as empty because default considers zero value empty; change the
template to check whether the replicas key is present and only fall back to 1
when it's absent. Replace the single-line "replicas: {{ .Values.ui.replicas |
default 1 }}" with a conditional using hasKey on .Values.ui (e.g., if hasKey
.Values.ui "replicas" then render "replicas: {{ .Values.ui.replicas }}" else
render "replicas: 1" end) so an explicit ui.replicas: 0 is preserved.

Comment on lines +43 to +47
auth:
# -- (required) Secret used for signing auth tokens (32+ characters)
authSecret: "change-me-to-a-random-32-char-secret"
# -- (required) HMAC secret for hashing API keys
gatewayApiKeyHashSecret: "change-me-to-a-random-secret"
Contributor

⚠️ Potential issue | 🟠 Major

Default secrets must be changed before production deployment.

The default values for authSecret and gatewayApiKeyHashSecret are placeholder strings. While comments indicate they need to be changed, consider:

  1. Adding validation in the templates to fail if defaults are unchanged
  2. Using required function in templates to enforce values
🛡️ Example template validation

Add to secret.yaml or a dedicated validation template:

{{- if eq .Values.auth.authSecret "change-me-to-a-random-32-char-secret" }}
{{- fail "auth.authSecret must be changed from the default value" }}
{{- end }}
{{- if eq .Values.auth.gatewayApiKeyHashSecret "change-me-to-a-random-secret" }}
{{- fail "auth.gatewayApiKeyHashSecret must be changed from the default value" }}
{{- end }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/values.yaml` around lines 43 - 47, The values.yaml
currently contains placeholder secrets (auth.authSecret and
auth.gatewayApiKeyHashSecret) that must not be used in production; update Helm
templates (e.g., secret.yaml or a new validation template) to validate these
keys at render time by checking .Values.auth.authSecret and
.Values.auth.gatewayApiKeyHashSecret and failing the install if they still equal
the default placeholders (or use the required function to enforce non-empty,
plus explicit equality checks to fail when the exact default strings are
present), emitting clear messages like "auth.authSecret must be changed from the
default value" and "auth.gatewayApiKeyHashSecret must be changed from the
default value".
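
The `required`-function approach from point 2 could be sketched as follows (the key names and `stringData` placement are assumptions, not the chart's actual secret.yaml; note that `required` only rejects empty values, so the placeholder defaults would also need to be removed from values.yaml or paired with the equality checks shown above):

```yaml
# secret.yaml (sketch): fail rendering when either value is missing
stringData:
  AUTH_SECRET: {{ required "auth.authSecret must be set" .Values.auth.authSecret | quote }}
  GATEWAY_API_KEY_HASH_SECRET: {{ required "auth.gatewayApiKeyHashSecret must be set" .Values.auth.gatewayApiKeyHashSecret | quote }}
```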

Comment on lines +292 to +299
postgresql:
enabled: true
image:
repository: postgres
tag: "17-alpine"
pullPolicy: IfNotPresent
user: postgres
password: "changeme"
Contributor

⚠️ Potential issue | 🟠 Major

Default PostgreSQL password "changeme" is insecure.

The default password should either be empty (forcing users to set it) or the template should validate that it has been changed from the default. Same validation pattern as suggested for auth secrets could apply here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infra/helm/llmgateway/values.yaml` around lines 292 - 299, The default
insecure password is set at postgresql.password ("changeme"); change this to an
empty value or require a user-supplied secret and add validation to ensure
postgresql.password is not the default. Update the helm values template to treat
postgresql.password as optional (empty by default) and/or add the same
validation logic used for auth secrets to fail deployment if postgresql.password
== "changeme" or left unchanged, referencing the postgresql block
(postgresql.enabled, postgresql.user, postgresql.password) so the chart forces
explicit credential configuration.
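
Applying the same validation pattern here might look like this (sketch; placed wherever the chart's render-time checks live):

```yaml
{{- if and .Values.postgresql.enabled (eq .Values.postgresql.password "changeme") }}
{{- fail "postgresql.password must be changed from the default value" }}
{{- end }}
```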

Comment thread apps/docs/content/self-host.mdx Outdated
You can also deploy LLMGateway using the Helm chart.

```bash
helm repo add llmgateway https://llmgateway.io/helm
```
Member

@yezz123 this link needs to be fixed

Contributor Author

Yes, we should deploy it either to GitHub Pages, or you can host it under the llmgateway domain

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/images.yml:
- Around line 409-412: The publish-helm job is not waiting for image tests and
may publish a chart before images pass; update the publish-helm job to depend on
the test-split job by adding "test-split" to its needs list and extend the if
condition to require needs.test-split.result == 'success' (i.e., change needs:
to include test-split and change if: to github.event_name == 'release' &&
needs.merge-split.result == 'success' && needs.test-split.result == 'success')
so chart publishing is gated on both merge-split and image-test success.
- Around line 431-438: Add a strict SemVer validation step that verifies
IMAGE_TAG (and derived CHART_VERSION) conforms to Helm-compatible SemVer before
mutating Chart.yaml: check IMAGE_TAG/CHART_VERSION against a regex that
disallows underscores and matches full SemVer (e.g. allow optional
prerelease/build with segments of [0-9A-Za-z-] separated by dots), fail the job
and do not run the sed edits if it doesn't match; also replace the existing
release-tag validation pattern `(-[a-zA-Z0-9_-]+)*` with the stricter pattern
`(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?` so IMAGE_TAGs like `v1.2.3-rc_1` are
rejected before images are pushed.
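
That stricter check could be sketched as an early workflow step (the step name and placement are illustrative; the regex is the stricter pattern proposed above, which rejects `v1.2.3-rc_1` because `_` is not allowed in prerelease segments):

```yaml
- name: Validate release tag is Helm-compatible SemVer
  run: |
    IMAGE_TAG="${{ needs.setup.outputs.image_tag }}"
    if ! printf '%s' "${IMAGE_TAG}" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?$'; then
      echo "IMAGE_TAG '${IMAGE_TAG}' is not valid SemVer for a chart version" >&2
      exit 1
    fi
```
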

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 50bd52b5-09f9-4a3b-b47d-f8eba3e96347

📥 Commits

Reviewing files that changed from the base of the PR and between db4e2a5 and 96fa4dd.

📒 Files selected for processing (3)
  • .github/workflows/images.yml
  • apps/docs/content/self-host.mdx
  • infra/helm/README.md
✅ Files skipped from review due to trivial changes (2)
  • infra/helm/README.md
  • apps/docs/content/self-host.mdx

Comment on lines +409 to +412
needs:
- setup
- merge-split
if: github.event_name == 'release' && needs.merge-split.result == 'success'
Contributor

⚠️ Potential issue | 🟠 Major

Wait for image tests before publishing the Helm chart.

publish-helm runs after merge-split, in parallel with test-split, so a release can publish a chart pointing at images that later fail the split-image tests. Gate chart publishing on the image test job too.

🚦 Proposed workflow gate
     needs:
       - setup
       - merge-split
-    if: github.event_name == 'release' && needs.merge-split.result == 'success'
+      - test-split
+    if: github.event_name == 'release' && needs.merge-split.result == 'success' && needs.test-split.result == 'success'
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
     needs:
       - setup
       - merge-split
-    if: github.event_name == 'release' && needs.merge-split.result == 'success'
+      - test-split
+    if: github.event_name == 'release' && needs.merge-split.result == 'success' && needs.test-split.result == 'success'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/images.yml around lines 409 - 412, The publish-helm job is
not waiting for image tests and may publish a chart before images pass; update
the publish-helm job to depend on the test-split job by adding "test-split" to
its needs list and extend the if condition to require needs.test-split.result ==
'success' (i.e., change needs: to include test-split and change if: to
github.event_name == 'release' && needs.merge-split.result == 'success' &&
needs.test-split.result == 'success') so chart publishing is gated on both
merge-split and image-test success.
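
The effect of the extra dependency can be sketched with a tiny Python model of the two `if:` expressions (an illustration only, not how GitHub Actions evaluates them):

```python
# Model a release run where the images merged fine but the split-image tests failed.
results = {"merge-split": "success", "test-split": "failure"}
event_name = "release"

# Current gate: publish-helm only checks merge-split, so it would still run.
old_gate = event_name == "release" and results["merge-split"] == "success"

# Proposed gate: publishing additionally requires test-split to have succeeded.
new_gate = old_gate and results["test-split"] == "success"

print(old_gate, new_gate)  # → True False
```

Under the current condition the chart publishes despite the failed tests; the proposed condition blocks it.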

Comment on lines +431 to +438
      - name: Set chart version
        working-directory: infra/helm/llmgateway
        run: |
          IMAGE_TAG="${{ needs.setup.outputs.image_tag }}"
          CHART_VERSION="${IMAGE_TAG#v}"
          sed -i "s/^version:.*/version: ${CHART_VERSION}/" Chart.yaml
          sed -i "s/^appVersion:.*/appVersion: \"${IMAGE_TAG}\"/" Chart.yaml
          cat Chart.yaml

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verifies that the existing release-tag pattern accepts values rejected by Helm/SemVer-style chart versions.

python - <<'PY'
import re

workflow_tag_re = re.compile(r'^v[0-9]+\.[0-9]+\.[0-9]+(-[a-zA-Z0-9_-]+)*$')
helm_chart_version_re = re.compile(
    r'^(0|[1-9][0-9]*)\.'
    r'(0|[1-9][0-9]*)\.'
    r'(0|[1-9][0-9]*)'
    r'(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?$'
)

for tag in ["v1.2.3", "v1.2.3-rc1", "v1.2.3-rc_1", "v1.2.3-rc.1"]:
    chart_version = tag[1:]
    print(
        tag,
        "workflow_accepts=", bool(workflow_tag_re.fullmatch(tag)),
        "helm_semver_accepts=", bool(helm_chart_version_re.fullmatch(chart_version)),
    )
PY

Repository: theopenco/llmgateway

Length of output: 303


🏁 Script executed:

# First, check the workflow file exists and read the relevant sections
if [ -f .github/workflows/images.yml ]; then
  echo "=== Lines 40-50 (checking release-tag validation at line 44) ==="
  sed -n '40,50p' .github/workflows/images.yml
  echo ""
  echo "=== Lines 425-445 (checking chart version code at lines 431-438) ==="
  sed -n '425,445p' .github/workflows/images.yml
else
  echo "File not found at .github/workflows/images.yml"
  find . -name "images.yml" -o -name "*images*.yml" 2>/dev/null | head -20
fi

Repository: theopenco/llmgateway

Length of output: 1502


🏁 Script executed:

# Create a minimal test to verify Helm behavior with underscores in version
# This tests the actual Helm package behavior
python3 <<'PYEOF'
import re

# Test the actual Helm SemVer spec
# Per https://semver.org and Helm documentation
helm_semver = re.compile(
    r'^(0|[1-9][0-9]*)\.'
    r'(0|[1-9][0-9]*)\.'
    r'(0|[1-9][0-9]*)'
    r'(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?$'
)

workflow_pattern = re.compile(r'^v[0-9]+\.[0-9]+\.[0-9]+(-[a-zA-Z0-9_-]+)*$')

test_cases = [
    ("v1.2.3", "basic version"),
    ("v1.2.3-rc1", "prerelease no separator"),
    ("v1.2.3-rc_1", "prerelease with underscore"),
    ("v1.2.3-rc.1", "prerelease with dot"),
    ("v1.2.3-rc-1", "prerelease with dash"),
    ("v1.2.3-alpha_beta", "multiple underscore"),
]

print("Testing workflow tag validation vs Helm SemVer compatibility:\n")
print(f"{'Tag':<20} {'Workflow':<12} {'Helm SemVer':<15} {'Status':<40}")
print("-" * 87)

for tag, desc in test_cases:
    chart_version = tag[1:] if tag.startswith('v') else tag
    workflow_ok = bool(workflow_pattern.fullmatch(tag))
    helm_ok = bool(helm_semver.fullmatch(chart_version))
    
    if workflow_ok and not helm_ok:
        status = "❌ MISMATCH: Passes workflow, fails Helm"
    elif not workflow_ok and helm_ok:
        status = "⚠️  Inverse: Fails workflow, passes Helm"
    elif workflow_ok and helm_ok:
        status = "✓ Both pass"
    else:
        status = "✗ Both fail"
    
    print(f"{tag:<20} {str(workflow_ok):<12} {str(helm_ok):<15} {status:<40}")
PYEOF

Repository: theopenco/llmgateway

Length of output: 851


Validate chart versions with Helm-compatible SemVer.

CHART_VERSION inherits the Docker tag validator at line 44, which allows underscores in prerelease suffixes (e.g., v1.2.3-rc_1). Such tags pass workflow validation and the images are pushed to the registry, but the release then fails at the helm package step because Helm requires strict SemVer. Add SemVer validation before mutating Chart.yaml, and update the release-tag pattern at line 44 to enforce strict SemVer compliance.

🛡️ Proposed validation before mutating Chart.yaml
       - name: Set chart version
         working-directory: infra/helm/llmgateway
         run: |
           IMAGE_TAG="${{ needs.setup.outputs.image_tag }}"
           CHART_VERSION="${IMAGE_TAG#v}"
+          if [[ ! "${CHART_VERSION}" =~ ^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?$ ]]; then
+            echo "Invalid Helm chart version: ${CHART_VERSION}"
+            exit 1
+          fi
           sed -i "s/^version:.*/version: ${CHART_VERSION}/" Chart.yaml
           sed -i "s/^appVersion:.*/appVersion: \"${IMAGE_TAG}\"/" Chart.yaml
           cat Chart.yaml

Also update line 44 validation pattern from (-[a-zA-Z0-9_-]+)* to (-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)? to reject underscores and enforce SemVer before pushing images.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
       - name: Set chart version
         working-directory: infra/helm/llmgateway
         run: |
           IMAGE_TAG="${{ needs.setup.outputs.image_tag }}"
           CHART_VERSION="${IMAGE_TAG#v}"
+          if [[ ! "${CHART_VERSION}" =~ ^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?$ ]]; then
+            echo "Invalid Helm chart version: ${CHART_VERSION}"
+            exit 1
+          fi
           sed -i "s/^version:.*/version: ${CHART_VERSION}/" Chart.yaml
           sed -i "s/^appVersion:.*/appVersion: \"${IMAGE_TAG}\"/" Chart.yaml
           cat Chart.yaml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/images.yml around lines 431 - 438, Add a strict SemVer
validation step that verifies IMAGE_TAG (and derived CHART_VERSION) conforms to
Helm-compatible SemVer before mutating Chart.yaml: check IMAGE_TAG/CHART_VERSION
against a regex that disallows underscores and matches full SemVer (e.g. allow
optional prerelease/build with segments of [0-9A-Za-z-] separated by dots), fail
the job and do not run the sed edits if it doesn't match; also replace the
existing release-tag validation pattern `(-[a-zA-Z0-9_-]+)*` with the stricter
pattern `(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?` so IMAGE_TAGs like `v1.2.3-rc_1`
are rejected before images are pushed.
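
As a quick sanity check on the stricter pattern, here is a standalone Python sketch mirroring the proposed regex (illustrative only; the workflow itself would use a bash `=~` test):

```python
import re

# Helm-compatible SemVer tag: no underscores, dot-separated prerelease segments.
strict_tag = re.compile(
    r"^v(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)"
    r"(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?$"
)

tags = ["v1.2.3", "v1.2.3-rc1", "v1.2.3-rc.1", "v1.2.3-rc_1"]
accepted = [t for t in tags if strict_tag.fullmatch(t)]
print(accepted)  # the underscore variant is the only one rejected
```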

@steebchen steebchen enabled auto-merge April 19, 2026 11:24
@steebchen steebchen added this pull request to the merge queue Apr 19, 2026
Merged via the queue into theopenco:main with commit 9a46a1b Apr 19, 2026
25 checks passed
