Support multi-cluster deployment via Cluster Inventory API #2264

@kahirokunn

Description

Problem

The Knative Operator can only install Knative on the local cluster where the operator itself is running. In multi-cluster environments, you need to install and manage the operator independently on each cluster, which makes centralized configuration management difficult.

The Cluster Inventory API (KEP-4322) from SIG-Multicluster has been making good progress, and ClusterProfile now provides a standardized way to obtain connection details for remote clusters via status.accessProviders. By building on this, we can add multi-cluster support without depending on any specific fleet manager.

Adoption of ClusterProfile across the multi-cluster ecosystem has also accelerated, and it is increasingly being treated as a first-class citizen for addressing remote clusters:

Alongside cluster-inventory-api v0.1.0, two official credential plugin implementations are now provided (secretreader and kubeconfig-secretreader), plus a kubelogin-based example demonstrating OIDC / SSO-backed access. A released API, official reference plugins, and real consumers are therefore all in place: exactly the building blocks the Knative Operator would rely on.

A particularly compelling scenario this unlocks is air-gapped / private-network remote clusters. OCM's cluster-proxy establishes a Konnectivity-style reverse tunnel from the spoke side, so the hub never needs direct network reachability to the remote cluster's API server. Combined with a ClusterProfile access provider plugin that routes traffic through cluster-proxy, a single Knative Operator running on a management cluster could reconcile KnativeServing on remote clusters sitting behind NAT, in private VPCs, or in fully egress-only environments, with no inbound ports, bastion hosts, VPN peering, or hand-maintained kubeconfig Secrets on the remote side.

Proposed API change: add clusterProfileRef to KnativeServingSpec:

type ClusterProfileReference struct {
    Name      string `json:"name"`
    Namespace string `json:"namespace"`
}

type KnativeServingSpec struct {
    // ... existing fields ...

    // ClusterProfileRef is an optional reference to a ClusterProfile resource
    // (multicluster.x-k8s.io/v1alpha1). When set, the operator reconciles
    // Knative Serving on the remote cluster described by the referenced
    // ClusterProfile instead of the local cluster.
    // +optional
    ClusterProfileRef *ClusterProfileReference `json:"clusterProfileRef,omitempty"`
}

Example CR:

apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving-apac
  namespace: knative-serving
spec:
  version: "1.21"
  clusterProfileRef:
    name: apac-cluster-01
    namespace: fleet-system
  config:
    network:
      ingress-class: "kourier.ingress.networking.knative.dev"
  ingress:
    kourier:
      enabled: true

When clusterProfileRef is not set, no change: the operator reconciles on the local cluster as it does today. When set, the operator reads the referenced ClusterProfile, obtains connection details from status.accessProviders, and reconciles Knative Serving against that remote cluster. Because connection details are resolved via the standard access provider plugin model, reachability concerns (direct API access, OIDC/SSO, reverse tunnels such as cluster-proxy) are handled by the plugin rather than by the Knative Operator itself.
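The branching described above could look roughly like the sketch below. The types are simplified stand-ins for illustration only; a real reconciler would thread a rest.Config / client through its setup rather than return a string:

```go
package main

import "fmt"

// Simplified stand-ins for the proposed API types (illustration only).
type ClusterProfileReference struct {
	Name      string
	Namespace string
}

type KnativeServingSpec struct {
	ClusterProfileRef *ClusterProfileReference
}

// targetCluster reports which cluster the operator would reconcile against
// for a given spec: the local cluster (today's behavior) when the reference
// is unset, otherwise the remote cluster named by the ClusterProfile.
func targetCluster(spec KnativeServingSpec) string {
	if spec.ClusterProfileRef == nil {
		return "local"
	}
	return fmt.Sprintf("remote:%s/%s",
		spec.ClusterProfileRef.Namespace, spec.ClusterProfileRef.Name)
}

func main() {
	fmt.Println(targetCluster(KnativeServingSpec{}))
	fmt.Println(targetCluster(KnativeServingSpec{
		ClusterProfileRef: &ClusterProfileReference{
			Name:      "apac-cluster-01",
			Namespace: "fleet-system",
		},
	}))
}
```

The key property exercised here is the exit criterion below: an unset reference must leave behavior exactly as it is today.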

For building a rest.Config for the remote cluster, BuildConfigFromCP(clusterProfile) from the cluster-inventory-api pkg/credentials package can be used directly. The following examples are helpful:

  • controller-example main.go — shows the end-to-end flow of fetching a ClusterProfile, calling BuildConfigFromCP to get a remote rest.Config, and creating client-go / controller-runtime clients from it.
  • kubeconfig-secretreader plugin — a concrete demo that sets up hub/spoke kind clusters and authenticates via a Secret.

Persona:
Platform Provider / Cluster Operator

Exit Criteria

  • A KnativeServing CR with spec.clusterProfileRef set successfully deploys Knative Serving components on the referenced remote cluster.
  • A KnativeServing CR without spec.clusterProfileRef continues to work as before (no regression).

Time Estimate (optional):
~5 developer-days
