WAF: Block config updates on cold miss #5135
Pull request overview
This PR aims to maintain a fail-closed posture for WAF-enabled traffic on control-plane restart by preventing NGINX config pushes when WAF bundles have never been successfully fetched, and by introducing an event to re-trigger reconciliation when bundles become available.
Changes:
- Add `BundlePending` to WAF policy state and set it when initial bundle fetch fails with no previous bundle.
- Withhold Gateway config push when any targeting WAF policy is bundle-pending.
- Introduce `WAFBundleReconcileEvent` and have the WAF poller manager attempt to inject it on first successful bundle fetch.
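
The withholding rule in the second bullet could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `WAFPolicy` struct, its `TargetGateway` field, and the helpers `pendingGateways` and `shouldPushConfig` are hypothetical names chosen for the example. Only `BundlePending` comes from the change list above.

```go
package main

import "fmt"

// WAFPolicy is a hypothetical, simplified stand-in for the WAF policy state
// held in the graph. Only BundlePending is named in the PR.
type WAFPolicy struct {
	Name          string
	TargetGateway string // Gateway this policy targets (assumed single target)
	BundlePending bool   // true when no bundle has ever been fetched successfully
}

// pendingGateways returns the set of Gateway names that are targeted by at
// least one bundle-pending WAF policy.
func pendingGateways(policies []WAFPolicy) map[string]struct{} {
	pending := make(map[string]struct{})
	for _, p := range policies {
		if p.BundlePending {
			pending[p.TargetGateway] = struct{}{}
		}
	}
	return pending
}

// shouldPushConfig reports whether NGINX config may be pushed for the Gateway.
// A Gateway with any pending policy is withheld, which keeps it fail-closed.
func shouldPushConfig(gateway string, pending map[string]struct{}) bool {
	_, blocked := pending[gateway]
	return !blocked
}

func main() {
	policies := []WAFPolicy{
		{Name: "waf-a", TargetGateway: "gw-1", BundlePending: true},
		{Name: "waf-b", TargetGateway: "gw-2", BundlePending: false},
	}
	pending := pendingGateways(policies)
	for _, gw := range []string{"gw-1", "gw-2"} {
		fmt.Printf("%s push allowed: %t\n", gw, shouldPushConfig(gw, pending))
	}
}
```

In this sketch a single pending policy is enough to block the whole Gateway, which matches the "any targeting WAF policy" wording above.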
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| internal/framework/waf/manager.go | Tracks bundle→policy ownership and attempts to inject WAFBundleReconcileEvent on first successful fetch. |
| internal/framework/waf/manager_test.go | Adds unit tests for reconcile-event injection behavior. |
| internal/framework/events/event.go | Adds the WAFBundleReconcileEvent type. |
| internal/controller/state/graph/policies.go | Introduces BundlePending on WAF state and sets pending condition on cold-miss fetch failure. |
| internal/controller/state/graph/policies_test.go | Updates expectations to reflect pending (fail-closed) behavior instead of invalidation. |
| internal/controller/state/conditions/conditions.go | Adds a new “bundle pending” programmed condition helper. |
| internal/controller/manager.go | Wires the main event loop channel into the WAF poller manager. |
| internal/controller/handler.go | Blocks config pushes for Gateways affected by pending WAF bundles; adds event handling case. |
| internal/controller/handler_test.go | Adds coverage for pending-gateway detection and event handling (panic guard). |
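
As a rough illustration of the `policies.go` change listed in the table, the choice between a fresh bundle, the stale-bundle fallback, and the pending state might look like the sketch below. The `fetchOutcome` type, its constants, `errFetch`, and `decide` are assumptions made for this example and are not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// fetchOutcome is a hypothetical enum describing what the graph does with a
// WAF policy after a bundle fetch attempt (retries already exhausted).
type fetchOutcome int

const (
	useFreshBundle fetchOutcome = iota // fetch succeeded
	useStaleBundle                     // fetch failed, previous bundle exists
	bundlePending                      // fetch failed, no bundle ever fetched (cold miss)
)

// errFetch is a placeholder error used only by this example.
var errFetch = errors.New("bundle fetch failed")

// decide maps a fetch result onto an outcome. Only the cold-miss case results
// in the pending (fail-closed) state; the stale fallback path is unchanged.
func decide(fetchErr error, havePrevious bool) fetchOutcome {
	switch {
	case fetchErr == nil:
		return useFreshBundle
	case havePrevious:
		return useStaleBundle
	default:
		return bundlePending
	}
}

func main() {
	fmt.Println(decide(nil, false) == useFreshBundle)
	fmt.Println(decide(errFetch, true) == useStaleBundle)
	fmt.Println(decide(errFetch, false) == bundlePending)
}
```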
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Proposed changes
Problem: On control plane restart, the in-memory WAF bundle cache is lost. If the initial bundle fetch fails after restart, the existing code sets `policy.Valid=false` and pushes NGINX config without WAF directives, silently bypassing WAF protection for traffic the operator explicitly configured to be protected.
Solution: Introduces a BundlePending state that withholds the Gateway config push entirely when a bundle has never been successfully fetched and all retries are exhausted, maintaining fail-closed posture. For policies with polling enabled, the WAF poller manager now holds a send-side reference to the main event loop channel and injects a WAFBundleReconcileEvent the first time it successfully fetches a previously-pending bundle, triggering an immediate re-reconcile and config push without operator intervention. Polling-disabled policies remain pending until the operator nudges a resource to trigger reconciliation. The existing RetryAttempts API field and stale-bundle fallback path (used when a previous bundle exists) are both unchanged.
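
The injection step described in the solution can be sketched as below. This is an assumption-laden illustration rather than the manager code: the `poller` type, its `pending` map, the `PolicyKey` field, and `onFetchSuccess` are hypothetical, and the non-blocking send is one possible reading of "attempt to inject". Only the `WAFBundleReconcileEvent` name and the send-side channel reference come from the description.

```go
package main

import "fmt"

// WAFBundleReconcileEvent reuses the event name from this PR; the PolicyKey
// field is an assumption for the example.
type WAFBundleReconcileEvent struct {
	PolicyKey string
}

// poller is a hypothetical stand-in for the WAF poller manager. It holds only
// the send side of the main event loop channel.
type poller struct {
	eventCh chan<- interface{}
	pending map[string]bool // policies whose bundle has never been fetched
}

// onFetchSuccess is called after a successful bundle fetch. It injects a
// reconcile event only for a previously-pending policy, and reports whether
// an event was sent.
func (p *poller) onFetchSuccess(policyKey string) bool {
	if !p.pending[policyKey] {
		return false // not a first successful fetch; nothing to inject
	}
	select {
	case p.eventCh <- WAFBundleReconcileEvent{PolicyKey: policyKey}:
		delete(p.pending, policyKey)
		return true
	default:
		// Channel is full: keep the policy pending so a later poll retries
		// instead of blocking the poller.
		return false
	}
}

func main() {
	ch := make(chan interface{}, 1)
	p := &poller{eventCh: ch, pending: map[string]bool{"default/waf": true}}

	fmt.Println("injected:", p.onFetchSuccess("default/waf"))
	fmt.Println("injected again:", p.onFetchSuccess("default/waf"))
	fmt.Printf("event: %+v\n", <-ch)
}
```

Passing the channel as `chan<-` keeps the poller from reading events off the main loop, so it can only signal that a reconcile is needed.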
Testing: Describe any testing that you did.
Please focus on (optional): If you have any specific areas where you would like reviewers to focus their attention or provide
specific feedback, add them here.
Closes #ISSUE
Checklist
Before creating a PR, run through this checklist and mark each as complete.
Release notes
If this PR introduces a change that affects users and needs to be mentioned in the release notes,
please add a brief note that summarizes the change.