Skip to content

CNTRLPLANE-3213: Enable configurable PKI for managed certificate rotation#2958

Open
sanchezl wants to merge 2 commits intoopenshift:masterfrom
sanchezl:CNTRLPLANE-2014-configurable-pki
Open

CNTRLPLANE-3213: Enable configurable PKI for managed certificate rotation#2958
sanchezl wants to merge 2 commits intoopenshift:masterfrom
sanchezl:CNTRLPLANE-2014-configurable-pki

Conversation

@sanchezl
Copy link
Copy Markdown
Contributor

@sanchezl sanchezl commented Apr 10, 2026

Summary

Jira: CNTRLPLANE-3213

Wire up the ConfigurablePKI feature gate to the PKI controller's cert rotation, enabling cluster admins to configure certificate key algorithms (RSA/ECDSA) via the PKI CRD.

  • Set CertificateName and PKIProfileProvider on RotatedSigningCASecret and RotatedSelfSignedCertKeySecret for PKI profile resolution
  • Switch from ServingRotation + toClientCert to PeerRotation for target certificates (both client and server auth)
  • Register the PKI informer on the existing configInformers factory in operator.go and pass a ready-to-use PKIProfileProvider to the PKI controller
  • Make the signer controller algorithm-agnostic: remove hardcoded SHA512WithRSA, support PKCS#8/EC key formats

When ConfigurablePKI is disabled (the default), behavior is identical to before — nil PKIProfileProvider preserves legacy RSA-2048 behavior.

Certificate names

The PKI controller uses fixed certificate names for PKI profile resolution:

Type CertificateName
Signer (CA) network.signer
Target (peer) network.peer

Fixed names are used rather than per-resource names because the OperatorPKI CRD is namespaced and any resource can be created — using network.<name>-* would produce unpredictable names that wouldn't match PKI profile overrides. All OperatorPKI-managed certificates should resolve the same PKI profile.

Reviewer notes

ServingRotation → PeerRotation (legacy path): When PKIProfileProvider is nil (feature gate off), PeerRotation.NewCertificate receives nil keyGen and falls through to MakeServerCertForDuration with a peerExtFn that sets ExtKeyUsage: [ClientAuth, ServerAuth]. This is functionally identical to the old ServingRotation + toClientCert extension function, which appended ClientAuth to the existing [ServerAuth]. The only difference is ordering in the slice, which has no semantic meaning per X.509.

SHA512WithRSA → auto-detect: Removing the hardcoded SignatureAlgorithm: x509.SHA512WithRSA from the CSR signer's certificate template means Go's x509.CreateCertificate auto-selects the algorithm from the signing key type. For RSA keys this changes from SHA512WithRSA to SHA256WithRSA (Go's default). SHA256 is the industry standard; SHA512 was an unusual choice. Both are widely supported. This change is necessary to support ECDSA CAs (which cannot use RSA signature algorithms).

Removed debug leak: The old decodePrivateKey had a fmt.Println(block.Type) that printed to stdout in production when the PEM block type was unexpected. This is removed as part of the refactoring.

Informer lifecycle: The PKI informer is registered on the existing configInformers factory in operator.go, re-started (idempotent — only starts the newly registered PKI informer), and synced before controllers begin reconciling. The PKI controller receives a ready-to-use PKIProfileProvider with no informer management of its own.

References

Test plan

  • go build ./... — compiles clean
  • go vet ./... — clean
  • go test ./pkg/controller/signer/... — existing tests pass (validates algorithm-agnostic key decoding)
  • Unit tests pass (go test ./pkg/... ./cmd/...)
  • CI e2e (ConfigurablePKI disabled — default path, no behavioral change)
  • TechPreview e2e (ConfigurablePKI enabled — verifies PKI profile resolution)

Bump github.com/openshift/library-go to v0.0.0-20260409165127-c57da2bf5720
which includes configurable PKI support (pkg/pki package, CertificateName
and PKIProfileProvider fields on certrotation structs, PeerRotation type).
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 10, 2026
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Apr 10, 2026

@sanchezl: This pull request references CNTRLPLANE-2014 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary

Wire up the ConfigurablePKI feature gate to the PKI controller's cert rotation, enabling cluster admins to configure certificate key algorithms (RSA/ECDSA) via the PKI CRD.

  • Set CertificateName and PKIProfileProvider on RotatedSigningCASecret and RotatedSelfSignedCertKeySecret for PKI profile resolution
  • Switch from ServingRotation + toClientCert to PeerRotation for target certificates (both client and server auth)
  • Create a config informer factory in the PKI reconciler for the PKI lister when ConfigurablePKI is enabled
  • Make the signer controller algorithm-agnostic: remove hardcoded SHA512WithRSA, support PKCS#8/EC key formats

When ConfigurablePKI is disabled (the default), behavior is identical to before — nil PKIProfileProvider preserves legacy RSA-2048 behavior.

Certificate names

OperatorPKI Signer Target (peer)
ovn network.ovn-signer network.ovn-peer
network-node-identity network.network-node-identity-signer network.network-node-identity-peer

References

Test plan

  • go build ./... — compiles clean
  • go vet ./... — clean
  • go test ./pkg/controller/signer/... — existing tests pass (validates algorithm-agnostic key decoding)
  • Unit tests pass (go test ./pkg/... ./cmd/...)
  • CI e2e (ConfigurablePKI disabled — default path, no behavioral change)
  • TechPreview e2e (ConfigurablePKI enabled — verifies PKI profile resolution)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 10, 2026

Walkthrough

PKI controller wiring changed to accept a PKI profile provider and is no longer registered via aggregated init; operator startup conditionally constructs and wires the profile provider and starts the PKI controller. Signer key parsing was generalized to return crypto.Signer and to support PKCS#1, PKCS#8, and EC keys. go.mod dependency version updated.

Changes

Cohort / File(s) Summary
Dependency
go.mod
Updated github.com/openshift/library-go version; moved k8s.io/apiserver v0.35.2 from indirect to direct require.
PKI controller wiring & reconciler
pkg/controller/pki/pki_controller.go, pkg/controller/add_networkconfig.go
Add(...) signature changed to accept pkiProfileProvider; reconciler/internal types renamed (pkioperatorPKI) and store the provider; per-PKI construction threads provider and sets explicit CertificateName values; switched peer rotation to certrotation.PeerRotation with UserInfo; removed pki.Add from aggregated init() registration.
Operator startup changes
pkg/operator/operator.go
When FeatureGateConfigurablePKI is enabled, registers PKIs informer on config informer factory, starts informer factory, waits for cache sync, constructs NewClusterPKIProfileProvider, and calls pkictrl.Add(...) with the provider after other controllers are added.
Signer key handling
pkg/controller/signer/signer.go
decodePrivateKey now returns crypto.Signer; PEM decoding adds explicit "no PEM block" error and handles RSA PRIVATE KEY (PKCS#1), PRIVATE KEY (PKCS#8), and EC PRIVATE KEY; removed hardcoded SignatureAlgorithm selection and RSA-only assumptions; eliminated unused helper/imports.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

level=error msg="Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: err: exit status 1: stderr: go: inconsistent vendoring in :\n\tgithub.com/Masterminds/semver@v1.5.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/Masterminds/sprig/v3@v3.2.3: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/containernetworking/cni@v0.8.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/ghodss/yaml@v1.0.1-0.20190212211648-25d852aebe32: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/go-bindata/go-bindata@v3.1.2+incompatible: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/onsi/gomega@v1.39.1: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/ope

... [truncated 17356 characters] ...

ired in go.mod, but not marked as explicit in vendor/modules.txt\n\tk8s.io/gengo/v2@v2.0.0-20251215205346-5ee0d033ba5b: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tk8s.io/kms@v0.35.2: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tk8s.io/kube-aggregator@v0.35.1: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tsigs.k8s.io/randfill@v1.0.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tsigs.k8s.io/structured-merge-diff/v6@v6.3.2: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\n\tTo ignore the vendor directory, use -mod=readonly or -mod=mod.\n\tTo sync the vendor directory, run:\n\t\tgo mod vendor\n"


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 10, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sanchezl
Once this PR has been reviewed and has the lgtm label, please assign jacobtanenbaum for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Apr 10, 2026

@sanchezl: This pull request references CNTRLPLANE-2014 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary

Jira: CNTRLPLANE-3213

Wire up the ConfigurablePKI feature gate to the PKI controller's cert rotation, enabling cluster admins to configure certificate key algorithms (RSA/ECDSA) via the PKI CRD.

  • Set CertificateName and PKIProfileProvider on RotatedSigningCASecret and RotatedSelfSignedCertKeySecret for PKI profile resolution
  • Switch from ServingRotation + toClientCert to PeerRotation for target certificates (both client and server auth)
  • Create a config informer factory in the PKI reconciler for the PKI lister when ConfigurablePKI is enabled
  • Make the signer controller algorithm-agnostic: remove hardcoded SHA512WithRSA, support PKCS#8/EC key formats

When ConfigurablePKI is disabled (the default), behavior is identical to before — nil PKIProfileProvider preserves legacy RSA-2048 behavior.

Certificate names

OperatorPKI Signer Target (peer)
ovn network.ovn-signer network.ovn-peer
network-node-identity network.network-node-identity-signer network.network-node-identity-peer

References

Test plan

  • go build ./... — compiles clean
  • go vet ./... — clean
  • go test ./pkg/controller/signer/... — existing tests pass (validates algorithm-agnostic key decoding)
  • Unit tests pass (go test ./pkg/... ./cmd/...)
  • CI e2e (ConfigurablePKI disabled — default path, no behavioral change)
  • TechPreview e2e (ConfigurablePKI enabled — verifies PKI profile resolution)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@sanchezl sanchezl changed the title CNTRLPLANE-2014: Enable configurable PKI for managed certificate rotation CNTRLPLANE-3213: Enable configurable PKI for managed certificate rotation Apr 10, 2026
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Apr 10, 2026

@sanchezl: This pull request references CNTRLPLANE-3213 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary

Jira: CNTRLPLANE-3213

Wire up the ConfigurablePKI feature gate to the PKI controller's cert rotation, enabling cluster admins to configure certificate key algorithms (RSA/ECDSA) via the PKI CRD.

  • Set CertificateName and PKIProfileProvider on RotatedSigningCASecret and RotatedSelfSignedCertKeySecret for PKI profile resolution
  • Switch from ServingRotation + toClientCert to PeerRotation for target certificates (both client and server auth)
  • Create a config informer factory in the PKI reconciler for the PKI lister when ConfigurablePKI is enabled
  • Make the signer controller algorithm-agnostic: remove hardcoded SHA512WithRSA, support PKCS#8/EC key formats

When ConfigurablePKI is disabled (the default), behavior is identical to before — nil PKIProfileProvider preserves legacy RSA-2048 behavior.

Certificate names

OperatorPKI Signer Target (peer)
ovn network.ovn-signer network.ovn-peer
network-node-identity network.network-node-identity-signer network.network-node-identity-peer

References

Test plan

  • go build ./... — compiles clean
  • go vet ./... — clean
  • go test ./pkg/controller/signer/... — existing tests pass (validates algorithm-agnostic key decoding)
  • Unit tests pass (go test ./pkg/... ./cmd/...)
  • CI e2e (ConfigurablePKI disabled — default path, no behavioral change)
  • TechPreview e2e (ConfigurablePKI enabled — verifies PKI profile resolution)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/pki/pki_controller.go`:
- Around line 111-121: Replace the private stopCh usage with wiring cfgInformers
to the manager lifecycle: create a manager Runnable (manager.Runnable or
manager.RunnableFunc) that uses the manager-provided context (ctx) to call
cfgInformers.Start(ctx.Done()) and then call cache.WaitForCacheSync(ctx.Done(),
cfgInformers.Config().V1alpha1().PKIs().Informer().HasSynced); if
WaitForCacheSync returns false, return a non-nil error so manager startup fails.
This removes the unclosed stopCh and ensures the informer is stopped with the
manager and startup fails when the PKI informer cache cannot sync.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 611125df-f583-43a6-aa9b-0ef3f96eee1d

📥 Commits

Reviewing files that changed from the base of the PR and between bc5af87 and dd6209d.

⛔ Files ignored due to path filters (14)
  • go.sum is excluded by !**/*.sum
  • vendor/github.com/openshift/library-go/pkg/crypto/cert_config.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/library-go/pkg/crypto/keygen.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/library-go/pkg/crypto/options.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/library-go/pkg/crypto/tls_adherence.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/library-go/pkg/operator/certrotation/client_cert_rotation_controller.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/library-go/pkg/operator/certrotation/signer.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/library-go/pkg/operator/certrotation/target.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/library-go/pkg/operator/v1helpers/helpers.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/library-go/pkg/pki/profile.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/library-go/pkg/pki/provider.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/library-go/pkg/pki/resolve.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/library-go/pkg/pki/types.go is excluded by !vendor/**, !**/vendor/**
  • vendor/modules.txt is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (3)
  • go.mod
  • pkg/controller/pki/pki_controller.go
  • pkg/controller/signer/signer.go

@sanchezl sanchezl force-pushed the CNTRLPLANE-2014-configurable-pki branch from dd6209d to a61feb1 Compare April 10, 2026 05:04
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Apr 10, 2026

@sanchezl: This pull request references CNTRLPLANE-3213 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary

Jira: CNTRLPLANE-3213

Wire up the ConfigurablePKI feature gate to the PKI controller's cert rotation, enabling cluster admins to configure certificate key algorithms (RSA/ECDSA) via the PKI CRD.

  • Set CertificateName and PKIProfileProvider on RotatedSigningCASecret and RotatedSelfSignedCertKeySecret for PKI profile resolution
  • Switch from ServingRotation + toClientCert to PeerRotation for target certificates (both client and server auth)
  • Register the PKI informer on the existing configInformers factory in operator.go and pass a ready-to-use PKIProfileProvider to the PKI controller
  • Make the signer controller algorithm-agnostic: remove hardcoded SHA512WithRSA, support PKCS#8/EC key formats

When ConfigurablePKI is disabled (the default), behavior is identical to before — nil PKIProfileProvider preserves legacy RSA-2048 behavior.

Certificate names

Certificate names are computed at runtime from the OperatorPKI resource name using network.<name>-signer and network.<name>-peer. The PKI controller dynamically creates a CertRotationController per OperatorPKI CR, so the names derive from the resource rather than being hardcoded. The current set of OperatorPKI resources produces:

OperatorPKI Signer Target (peer)
ovn network.ovn-signer network.ovn-peer
network-node-identity network.network-node-identity-signer network.network-node-identity-peer

References

Test plan

  • go build ./... — compiles clean
  • go vet ./... — clean
  • go test ./pkg/controller/signer/... — existing tests pass (validates algorithm-agnostic key decoding)
  • Unit tests pass (go test ./pkg/... ./cmd/...)
  • CI e2e (ConfigurablePKI disabled — default path, no behavioral change)
  • TechPreview e2e (ConfigurablePKI enabled — verifies PKI profile resolution)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@sanchezl sanchezl force-pushed the CNTRLPLANE-2014-configurable-pki branch from a61feb1 to f0608f5 Compare April 10, 2026 05:13
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Apr 10, 2026

@sanchezl: This pull request references CNTRLPLANE-3213 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary

Jira: CNTRLPLANE-3213

Wire up the ConfigurablePKI feature gate to the PKI controller's cert rotation, enabling cluster admins to configure certificate key algorithms (RSA/ECDSA) via the PKI CRD.

  • Set CertificateName and PKIProfileProvider on RotatedSigningCASecret and RotatedSelfSignedCertKeySecret for PKI profile resolution
  • Switch from ServingRotation + toClientCert to PeerRotation for target certificates (both client and server auth)
  • Register the PKI informer on the existing configInformers factory in operator.go and pass a ready-to-use PKIProfileProvider to the PKI controller
  • Make the signer controller algorithm-agnostic: remove hardcoded SHA512WithRSA, support PKCS#8/EC key formats

When ConfigurablePKI is disabled (the default), behavior is identical to before — nil PKIProfileProvider preserves legacy RSA-2048 behavior.

Certificate names

The PKI controller uses fixed certificate names for PKI profile resolution:

Type CertificateName
Signer (CA) network.signer
Target (peer) network.peer

Fixed names are used rather than per-resource names because the OperatorPKI CRD is namespaced and any resource can be created — using network.<name>-* would produce unpredictable names that wouldn't match PKI profile overrides. All OperatorPKI-managed certificates should resolve the same PKI profile.

References

Test plan

  • go build ./... — compiles clean
  • go vet ./... — clean
  • go test ./pkg/controller/signer/... — existing tests pass (validates algorithm-agnostic key decoding)
  • Unit tests pass (go test ./pkg/... ./cmd/...)
  • CI e2e (ConfigurablePKI disabled — default path, no behavioral change)
  • TechPreview e2e (ConfigurablePKI enabled — verifies PKI profile resolution)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/operator/operator.go`:
- Around line 146-158: The PKI informer sync uses wait.NeverStop which can hang
startup; change it to use a bounded stop channel with a timeout (matching the
feature-gate init pattern used earlier) so the
configInformers.Start/WaitForCacheSync call fails fast on timeout. Concretely,
create a timeout channel (e.g., stopCh := make(chan struct{}); start a goroutine
that closes stopCh after the same timeout duration used earlier, pass stopCh to
configInformers.Start and configInformers.WaitForCacheSync, and return an error
if any informer (from the map returned by configInformers.WaitForCacheSync)
failed to sync before the timeout; keep the logic around
featureGates.Enabled(features.FeatureGateConfigurablePKI) and the existing error
message in fmt.Errorf.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cd98c2a0-e043-44a2-ad01-106115a7798d

📥 Commits

Reviewing files that changed from the base of the PR and between dd6209d and a61feb1.

📒 Files selected for processing (4)
  • pkg/controller/add_networkconfig.go
  • pkg/controller/pki/pki_controller.go
  • pkg/controller/signer/signer.go
  • pkg/operator/operator.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/controller/pki/pki_controller.go

Comment on lines +146 to +158
// Set up PKI profile provider if ConfigurablePKI is enabled.
// Register the PKI informer on the existing configInformers factory,
// re-start to pick it up (idempotent for already-running informers),
// and wait for cache sync before controllers start reconciling.
var pkiProfileProvider pkipkg.PKIProfileProvider
if featureGates.Enabled(features.FeatureGateConfigurablePKI) {
configInformers.Config().V1alpha1().PKIs().Informer()
configInformers.Start(wait.NeverStop)
for t, synced := range configInformers.WaitForCacheSync(wait.NeverStop) {
if !synced {
return fmt.Errorf("failed to sync config informer for %v", t)
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Read the relevant section of the operator.go file
cat -n pkg/operator/operator.go | sed -n '140,170p'

Repository: openshift/cluster-network-operator

Length of output: 1500


🏁 Script executed:

# Check imports to understand available context/wait utilities
head -n 50 pkg/operator/operator.go | cat -n

Repository: openshift/cluster-network-operator

Length of output: 2361


🏁 Script executed:

# Search for other WaitForCacheSync usages in the codebase to understand patterns
rg 'WaitForCacheSync' -A 3 -B 1

Repository: openshift/cluster-network-operator

Length of output: 37944


🏁 Script executed:

# Look at the broader function context around lines 146-158 to check available variables
cat -n pkg/operator/operator.go | sed -n '100,180p'

Repository: openshift/cluster-network-operator

Length of output: 3832


🏁 Script executed:

# Find the function signature that contains this code block
sed -n '50,100p' pkg/operator/operator.go | cat -n

Repository: openshift/cluster-network-operator

Length of output: 2544


🏁 Script executed:

# Search for the function name containing line 146-162
rg -B 50 'Set up PKI profile provider if ConfigurablePKI' pkg/operator/operator.go | head -70

Repository: openshift/cluster-network-operator

Length of output: 1808


Bound the PKI informer sync wait.

Line 154 uses WaitForCacheSync(wait.NeverStop), which can hang operator startup indefinitely if the PKIs informer never syncs. This turns configuration or RBAC issues into a permanently hung operator rather than fast failure. Apply a timeout consistent with the feature gate initialization pattern used earlier (line 137).

Suggested fix
 	if featureGates.Enabled(features.FeatureGateConfigurablePKI) {
 		configInformers.Config().V1alpha1().PKIs().Informer()
 		configInformers.Start(wait.NeverStop)
-		for t, synced := range configInformers.WaitForCacheSync(wait.NeverStop) {
+		syncCtx, cancel := context.WithTimeout(ctx, time.Minute)
+		defer cancel()
+		for t, synced := range configInformers.WaitForCacheSync(syncCtx.Done()) {
 			if !synced {
-				return fmt.Errorf("failed to sync config informer for %v", t)
+				return fmt.Errorf("failed to sync config informer for %v before timeout", t)
 			}
 		}
 		pkiProfileProvider = pkipkg.NewClusterPKIProfileProvider(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/operator.go` around lines 146 - 158, The PKI informer sync uses
wait.NeverStop which can hang startup; change it to use a bounded stop channel
with a timeout (matching the feature-gate init pattern used earlier) so the
configInformers.Start/WaitForCacheSync call fails fast on timeout. Concretely,
create a timeout channel (e.g., stopCh := make(chan struct{}); start a goroutine
that closes stopCh after the same timeout duration used earlier, pass stopCh to
configInformers.Start and configInformers.WaitForCacheSync, and return an error
if any informer (from the map returned by configInformers.WaitForCacheSync)
failed to sync before the timeout; keep the logic around
featureGates.Enabled(features.FeatureGateConfigurablePKI) and the existing error
message in fmt.Errorf.

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Apr 10, 2026

@sanchezl: This pull request references CNTRLPLANE-3213 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary

Jira: CNTRLPLANE-3213

Wire up the ConfigurablePKI feature gate to the PKI controller's cert rotation, enabling cluster admins to configure certificate key algorithms (RSA/ECDSA) via the PKI CRD.

  • Set CertificateName and PKIProfileProvider on RotatedSigningCASecret and RotatedSelfSignedCertKeySecret for PKI profile resolution
  • Switch from ServingRotation + toClientCert to PeerRotation for target certificates (both client and server auth)
  • Register the PKI informer on the existing configInformers factory in operator.go and pass a ready-to-use PKIProfileProvider to the PKI controller
  • Make the signer controller algorithm-agnostic: remove hardcoded SHA512WithRSA, support PKCS#8/EC key formats

When ConfigurablePKI is disabled (the default), behavior is identical to before — nil PKIProfileProvider preserves legacy RSA-2048 behavior.

Certificate names

The PKI controller uses fixed certificate names for PKI profile resolution:

Type CertificateName
Signer (CA) network.signer
Target (peer) network.peer

Fixed names are used rather than per-resource names because the OperatorPKI CRD is namespaced and any resource can be created — using network.<name>-* would produce unpredictable names that wouldn't match PKI profile overrides. All OperatorPKI-managed certificates should resolve the same PKI profile.

Reviewer notes

ServingRotation → PeerRotation (legacy path): When PKIProfileProvider is nil (feature gate off), PeerRotation.NewCertificate receives nil keyGen and falls through to MakeServerCertForDuration with a peerExtFn that sets ExtKeyUsage: [ClientAuth, ServerAuth]. This is functionally identical to the old ServingRotation + toClientCert extension function, which appended ClientAuth to the existing [ServerAuth]. The only difference is ordering in the slice, which has no semantic meaning per X.509.

SHA512WithRSA → auto-detect: Removing the hardcoded SignatureAlgorithm: x509.SHA512WithRSA from the CSR signer's certificate template means Go's x509.CreateCertificate auto-selects the algorithm from the signing key type. For RSA keys this changes from SHA512WithRSA to SHA256WithRSA (Go's default). SHA256 is the industry standard; SHA512 was an unusual choice. Both are widely supported. This change is necessary to support ECDSA CAs (which cannot use RSA signature algorithms).

Removed debug leak: The old decodePrivateKey had a fmt.Println(block.Type) that printed to stdout in production when the PEM block type was unexpected. This is removed as part of the refactoring.

Informer lifecycle: The PKI informer is registered on the existing configInformers factory in operator.go, re-started (idempotent — only starts the newly registered PKI informer), and synced before controllers begin reconciling. The PKI controller receives a ready-to-use PKIProfileProvider with no informer management of its own.

References

Test plan

  • go build ./... — compiles clean
  • go vet ./... — clean
  • go test ./pkg/controller/signer/... — existing tests pass (validates algorithm-agnostic key decoding)
  • Unit tests pass (go test ./pkg/... ./cmd/...)
  • CI e2e (ConfigurablePKI disabled — default path, no behavioral change)
  • TechPreview e2e (ConfigurablePKI enabled — verifies PKI profile resolution)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

…tion

Wire up the ConfigurablePKI feature gate to the PKI controller's cert
rotation, enabling cluster admins to configure certificate key algorithms
via the PKI CRD when the gate is enabled.

Changes:
- Set CertificateName and PKIProfileProvider on RotatedSigningCASecret and
  RotatedSelfSignedCertKeySecret to enable PKI profile resolution
- Switch from ServingRotation+toClientCert to PeerRotation for target
  certificates that need both client and server authentication
- Create a config informer factory in the PKI reconciler to provide the
  PKI lister when ConfigurablePKI is enabled
- Make the signer controller algorithm-agnostic: remove hardcoded
  SHA512WithRSA signature algorithm (auto-detected from signing key) and
  support PKCS#8/EC private key formats alongside legacy PKCS#1 RSA

When ConfigurablePKI is disabled (the default), behavior is identical to
before — nil PKIProfileProvider preserves legacy RSA-2048 behavior.
@sanchezl sanchezl force-pushed the CNTRLPLANE-2014-configurable-pki branch from f0608f5 to 75e8e1d Compare April 10, 2026 05:34
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/pki/pki_controller.go`:
- Around line 216-217: CertificateName is being hardcoded to
"network.signer"/"network.peer" which forces all OperatorPKI resources to use
the same profile; instead compute CertificateName from the OperatorPKI resource
name (e.g., use the OperatorPKI object's Name to produce "<resourceName>.signer"
or "<resourceName>.peer") when constructing the PKI profile (refer to the
CertificateName field assignment alongside PKIProfileProvider in the code that
builds the profile for OperatorPKI). Update both places where
"network.signer"/"network.peer" are set so they derive from the OperatorPKI
resource name rather than using the literal strings.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6c9459cf-1078-4daf-a74c-ff5e3fe5f630

📥 Commits

Reviewing files that changed from the base of the PR and between f0608f5 and 75e8e1d.

📒 Files selected for processing (5)
  • go.mod
  • pkg/controller/add_networkconfig.go
  • pkg/controller/pki/pki_controller.go
  • pkg/controller/signer/signer.go
  • pkg/operator/operator.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/controller/add_networkconfig.go
  • pkg/operator/operator.go
  • go.mod

Comment on lines +216 to +217
CertificateName: "network.signer",
PKIProfileProvider: pkiProfileProvider,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Compute CertificateName from the OperatorPKI resource name.

Hardcoding "network.signer" and "network.peer" makes every OperatorPKI resolve the same PKI profile. That breaks per-resource algorithm selection for resources like ovn vs network-node-identity.

Proposed fix
-			CertificateName:    "network.signer",
+			CertificateName:    fmt.Sprintf("network.%s-signer", config.Name),
 			PKIProfileProvider: pkiProfileProvider,
@@
-			CertificateName:    "network.peer",
+			CertificateName:    fmt.Sprintf("network.%s-peer", config.Name),
 			PKIProfileProvider: pkiProfileProvider,

Also applies to: 242-243

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/pki/pki_controller.go` around lines 216 - 217, CertificateName
is being hardcoded to "network.signer"/"network.peer" which forces all
OperatorPKI resources to use the same profile; instead compute CertificateName
from the OperatorPKI resource name (e.g., use the OperatorPKI object's Name to
produce "<resourceName>.signer" or "<resourceName>.peer") when constructing the
PKI profile (refer to the CertificateName field assignment alongside
PKIProfileProvider in the code that builds the profile for OperatorPKI). Update
both places where "network.signer"/"network.peer" are set so they derive from
the OperatorPKI resource name rather than using the literal strings.

@sanchezl
Copy link
Copy Markdown
Contributor Author

/retest-required

@sanchezl
Copy link
Copy Markdown
Contributor Author

TechPreview e2e: ECDSA certificates confirmed

The e2e-aws-ovn-rhcos10-techpreview job ran with ConfigurablePKI enabled. Inspecting the gather-extra artifacts confirms all OperatorPKI-managed certificates were created as ECDSA P-256:

Namespace Secret Type Algorithm Validity
openshift-ovn-kubernetes ovn-ca signer ECDSA P-256 10 years
openshift-ovn-kubernetes ovn-cert target (peer) ECDSA P-256 6 months
openshift-ovn-kubernetes signer-ca signer ECDSA P-256 10 years
openshift-network-node-identity network-node-identity-ca signer ECDSA P-256 10 years
openshift-network-node-identity network-node-identity-cert target (peer) ECDSA P-256 6 months

This validates:

  • PKIProfileProvider resolved the default TechPreview PKI profile (ECDSA P-256)
  • PeerRotation created target certs with both ClientAuth and ServerAuth EKU
  • The signer controller's algorithm-agnostic decodePrivateKey successfully parsed the ECDSA CA keys
  • The cluster bootstrapped and ran e2e tests with these certificates

Source: gather-extra/artifacts/inspect

@sanchezl
Copy link
Copy Markdown
Contributor Author

/retest-required

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 12, 2026

@sanchezl: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-ovn-rhcos10-techpreview 75e8e1d link false /test e2e-aws-ovn-rhcos10-techpreview
ci/prow/4.22-upgrade-from-stable-4.21-e2e-aws-ovn-upgrade 75e8e1d link false /test 4.22-upgrade-from-stable-4.21-e2e-aws-ovn-upgrade
ci/prow/security 75e8e1d link false /test security
ci/prow/4.22-upgrade-from-stable-4.21-e2e-azure-ovn-upgrade 75e8e1d link false /test 4.22-upgrade-from-stable-4.21-e2e-azure-ovn-upgrade
ci/prow/4.22-upgrade-from-stable-4.21-e2e-gcp-ovn-upgrade 75e8e1d link false /test 4.22-upgrade-from-stable-4.21-e2e-gcp-ovn-upgrade
ci/prow/e2e-aws-ovn-hypershift-conformance 75e8e1d link true /test e2e-aws-ovn-hypershift-conformance

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants