feat(hetzner): generate HCLOUD_CLUSTER_CONFIG for cluster-autoscaler addon by bjornharrtell · Pull Request #18137 · kubernetes/kops

bjornharrtell · 2026-03-30T16:35:48Z

Summary

Partial implementation of #18136. This draft establishes the overall approach for generating HCLOUD_CLUSTER_CONFIG and wiring it through the cluster-autoscaler addon for Hetzner.

Builds on #18135 (HCLOUD_TOKEN and --nodes format fixes).

Requires kubernetes/autoscaler#9430 to be merged for the labels to be applied to autoscaler-created servers.

What this PR does

1. `HetznerClusterAutoscalerConfig()` template function

A new template function in template_functions.go that generates the base64-encoded JSON blob for HCLOUD_CLUSTER_CONFIG. For each autoscalable node instance group it produces a NodeConfig entry containing:

labels: The full set of Hetzner server labels computed by CloudTagsForInstanceGroup(), the same labels that kops stamps on servers it creates directly. With autoscaler PR kubernetes/autoscaler#9430 (feat(hetzner): add serverLabels field to nodeConfig for Hetzner server labels), these are applied to autoscaler-created servers at creation time, so kops cloud instance group reconciliation correctly counts them.
imagesForArch: The Hetzner image name from ig.Spec.Image.
cloudInit: Intentionally empty (see below).

2. `hcloud-autoscaler-config` Secret

A new Secret resource added to the cluster-autoscaler addon template (Hetzner only), populated with the output of the template function above.

3. `HCLOUD_CLUSTER_CONFIG` env var

Added to the autoscaler container's env block, sourced from the new secret, alongside the existing HCLOUD_TOKEN and HCLOUD_NETWORK.

What is still missing — cloud-init generation

The NodeConfig.CloudInit field is left empty in this implementation. This means autoscaler-created nodes will have the correct Hetzner labels but will not bootstrap into the cluster. Completing the implementation requires generating the nodeup bootstrap shell script for each node IG.

This is blocked by the fact that addon-template rendering happens before task execution: the CA keypairs, nodeup binary asset URLs, and NodeupConfigHash are not yet available when TemplateFunctions methods are called.

Two paths to resolve this:

Option A — thread through TemplateFunctions (simpler, focused change):
Pass fi.KeystoreReader, NodeUpAssets map[architectures.Architecture]*assets.MirroredAsset, and NodeUpConfigBuilder into TemplateFunctions at construction time in apply_cluster.go. The NodeupConfigHash can be computed by running the config builder for each IG before tasks are scheduled.

Option B — dedicated post-build task (more correct architecture):
Introduce a HetznerClusterAutoscalerConfigSecret task in hetznertasks/ that declares dependencies on each node IG's BootstrapScript task. After those tasks run, it reads the rendered cloud-init via fi.ResourceAsString(), assembles the ClusterConfig JSON, and creates or updates the hcloud-autoscaler-config Secret via the Kubernetes API. The addon template would still reference the secret; the task guarantees it is populated before the autoscaler deployment starts.
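
As a rough illustration of the assembly step in Option B, the sketch below fills in cloud-init only once every instance group has a rendered script, and fails otherwise. The `nodeConfig` type and the `withCloudInit` helper are hypothetical; a real task would obtain the scripts from the BootstrapScript tasks (for example via fi.ResourceAsString()) rather than from a plain map.

```go
package main

import "fmt"

// nodeConfig is a hypothetical, reduced stand-in for a per-instance-group entry.
type nodeConfig struct {
	Labels    map[string]string
	CloudInit string
}

// withCloudInit returns a copy of configs with CloudInit set from the rendered
// bootstrap scripts. It returns an error if any instance group has no script,
// so a half-populated config is never produced.
func withCloudInit(configs map[string]nodeConfig, scripts map[string]string) (map[string]nodeConfig, error) {
	out := make(map[string]nodeConfig, len(configs))
	for name, cfg := range configs {
		script, ok := scripts[name]
		if !ok || script == "" {
			return nil, fmt.Errorf("no rendered bootstrap script for instance group %q", name)
		}
		cfg.CloudInit = script // cfg is a copy, so the input map is not modified
		out[name] = cfg
	}
	return out, nil
}

func main() {
	configs := map[string]nodeConfig{"nodes-hel1": {Labels: map[string]string{"example-key": "example-value"}}}
	scripts := map[string]string{"nodes-hel1": "#!/bin/bash\necho placeholder bootstrap script\n"}

	filled, err := withCloudInit(configs, scripts)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", filled["nodes-hel1"].CloudInit)
}
```

Failing on a missing script mirrors the guarantee described above: the secret is only written once the inputs it depends on exist.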

Feedback on which option to pursue would be welcome before completing this draft.

Testing

Partially verified against a live kops 1.35 Hetzner cluster: kops update cluster renders the addon template without error and produces the hcloud-autoscaler-config secret with the expected JSON (correct labels, image name). Full end-to-end test (autoscaler creating nodes that join the cluster) is blocked on completing cloud-init generation.

Two fixes to make the kops-managed cluster-autoscaler addon work correctly on Hetzner:

1. Pass HCLOUD_TOKEN and HCLOUD_NETWORK env vars to the autoscaler pod. The addon template only had an env block for AWS (AWS_REGION); without the Hetzner token the autoscaler cannot authenticate and fails immediately on startup. The vars are sourced from the existing 'hcloud' secret in kube-system, which is already created by the CCM addon.

2. Fix the --nodes flag format. GetClusterAutoscalerNodeGroups() was producing the generic '<name>.<cluster>' suffix for all non-GCE providers, giving a 3-field format (min:max:name.cluster) that the Hetzner cloud provider does not recognise. Hetzner requires 5 fields: min:max:instanceType:region:name. The region argument is the Hetzner location name, which equals the subnet name stored in ig.Spec.Subnets[0] (e.g. 'hel1').
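
The 5-field --nodes value can be sketched as follows. The `instanceGroup` struct and `hetznerNodesArg` helper are hypothetical simplifications for illustration, not the kops types or the actual change to GetClusterAutoscalerNodeGroups(); the server type and group name in the example are placeholders.

```go
package main

import "fmt"

// instanceGroup is a hypothetical, reduced view of a kops instance group.
type instanceGroup struct {
	Name        string
	MinSize     int
	MaxSize     int
	MachineType string
	Subnets     []string
}

// hetznerNodesArg formats min:max:instanceType:region:name, taking the
// region from the first subnet, which holds the Hetzner location name.
func hetznerNodesArg(ig instanceGroup) (string, error) {
	if len(ig.Subnets) == 0 {
		return "", fmt.Errorf("instance group %q has no subnet to use as region", ig.Name)
	}
	return fmt.Sprintf("%d:%d:%s:%s:%s",
		ig.MinSize, ig.MaxSize, ig.MachineType, ig.Subnets[0], ig.Name), nil
}

func main() {
	arg, err := hetznerNodesArg(instanceGroup{
		Name:        "nodes-hel1",
		MinSize:     1,
		MaxSize:     5,
		MachineType: "cx22", // placeholder server type
		Subnets:     []string{"hel1"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("--nodes=" + arg) // prints --nodes=1:5:cx22:hel1:nodes-hel1
}
```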

…addon

Add a HetznerClusterAutoscalerConfig template function that builds the HCLOUD_CLUSTER_CONFIG JSON blob expected by the Hetzner cluster-autoscaler cloud provider (ClusterConfig struct in hetzner_manager.go).

The config encodes per-node-group entries (NodeConfig) containing the same Hetzner server labels that kops applies to servers it provisions directly. With autoscaler PR kubernetes/autoscaler#9430 in place, these labels are stamped onto autoscaler-created servers at creation time, so kops cloud instance group reconciliation correctly counts them.

A new hcloud-autoscaler-config Secret is added to the cluster-autoscaler addon manifest (Hetzner only). HCLOUD_CLUSTER_CONFIG is wired into the autoscaler deployment from this secret alongside the existing HCLOUD_TOKEN and HCLOUD_NETWORK vars.

The NodeConfig.CloudInit field is intentionally left empty in this draft: generating the nodeup bootstrap script requires CA keypairs and nodeup binary asset URLs that are not yet accessible at addon-template render time. This means autoscaler-created nodes will have the correct labels but will not bootstrap correctly until cloud-init generation is completed. The follow-up requires either threading the keystore and NodeUpAssets through TemplateFunctions or implementing a dedicated post-build task.

k8s-ci-robot · 2026-03-30T16:35:57Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign johngmyers for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2026-03-30T16:36:00Z

Hi @bjornharrtell. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

bjornharrtell · 2026-03-30T16:37:47Z

This is in draft because it depends on both kubernetes/autoscaler#9430 and #18135.

Also not sure about the options on how to solve the bootstrap script issue.

bjornharrtell added 2 commits March 30, 2026 18:09

k8s-ci-robot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Mar 30, 2026

k8s-ci-robot requested a review from johngmyers March 30, 2026 16:35

k8s-ci-robot added the cncf-cla: yes label (indicates the PR's author has signed the CNCF CLA) Mar 30, 2026

k8s-ci-robot requested a review from olemarkus March 30, 2026 16:35

k8s-ci-robot added the area/addons, size/L (denotes a PR that changes 100-499 lines, ignoring generated files), and needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test) labels Mar 30, 2026

bjornharrtell wants to merge 2 commits into kubernetes:master from bjornharrtell:hetzner-cluster-autoscaler-config
