diff --git a/README.md b/README.md index f807750..2ca1793 100644 --- a/README.md +++ b/README.md @@ -45,6 +45,8 @@ spec: > **Note:** `spec.targetResource` and `spec.schedule` (cron and duration) are **immutable** after creation. To change the target or schedule, delete the `SpannerAutoscaleSchedule` and create a new one. Only `spec.additionalProcessingUnits` can be updated in place. +> **Note:** When multiple schedules are active simultaneously (i.e. their windows overlap), the `additionalProcessingUnits` from all active schedules are **summed** and added to both `desiredMinPUs` and `desiredMaxPUs`. For example, if schedule A adds +1,000 PU and schedule B adds +5,000 PU and both are active at the same time, `desiredMinPUs = spec.processingUnits.min + 6,000`. + ## Installation Spanner Autoscaler can be installed using [KPT](https://kpt.dev/installation/) by following 2 steps: @@ -110,6 +112,28 @@ spec: highPriority: 60 ``` +#### Using total CPU utilization as scaling target: + +```yaml +apiVersion: spanner.mercari.com/v1beta1 +kind: SpannerAutoscaler +metadata: + name: spannerautoscaler-sample + namespace: your-namespace +spec: + targetInstance: + projectId: your-gcp-project-id + instanceId: your-spanner-instance-id + scaleConfig: + processingUnits: + min: 1000 + max: 10000 + targetCPUUtilization: + total: 65 +``` + +> **Note:** `highPriority` and `total` are mutually exclusive — exactly one must be specified. They use different Cloud Monitoring metrics (`spanner.googleapis.com/instance/cpu/utilization_by_priority` and `spanner.googleapis.com/instance/cpu/utilization` respectively) and the values are not directly comparable. When switching between them on a live resource, no scaling occurs during the first reconcile after the change because the status for the new metric type has not yet been populated by the syncer. Scaling resumes normally on the next sync cycle (default: 1 minute). + #### Single Service Account using Workload Identity: ```yaml @@ -183,8 +207,6 @@ spec: > **Note:** `spec.targetInstance` (`projectId` and `instanceId`) is **immutable** after creation. To change the target Spanner instance, delete the `SpannerAutoscaler` and create a new one. -> **Note:** `highPriority` and `total` in `targetCPUUtilization` are mutually exclusive — exactly one must be specified. They use different Cloud Monitoring metrics (`spanner.googleapis.com/instance/cpu/utilization_by_priority` and `spanner.googleapis.com/instance/cpu/utilization` respectively), so the values are not directly comparable. When switching between them on a live resource, no scaling occurs during the first reconcile after the change because the status for the new metric type has not yet been populated by the syncer. Scaling resumes normally on the next sync cycle (default: 1 minute). - ## GCP Setup On your GCP project, you will need to enable `spanner.googleapis.com` and `monitoring.googleapis.com` APIs. diff --git a/api/v1beta1/spannerautoscaleschedule_types.go b/api/v1beta1/spannerautoscaleschedule_types.go index 7eda6a2..a9c3257 100644 --- a/api/v1beta1/spannerautoscaleschedule_types.go +++ b/api/v1beta1/spannerautoscaleschedule_types.go @@ -31,13 +31,16 @@ type Schedule struct { // SpannerAutoscaleScheduleSpec defines the desired state of SpannerAutoscaleSchedule type SpannerAutoscaleScheduleSpec struct { - // The `SpannerAutoscaler` resource name with which this schedule will be registered + // The `SpannerAutoscaler` resource name with which this schedule will be registered. + // Immutable after creation. TargetResource string `json:"targetResource"` - // The extra compute capacity which will be added when this schedule is active + // The extra compute capacity which will be added when this schedule is active. + // This is the only field that can be updated after creation. AdditionalProcessingUnits int `json:"additionalProcessingUnits"` - // The details of when and for how long this schedule will be active + // The details of when and for how long this schedule will be active. + // Immutable after creation. Schedule Schedule `json:"schedule"` } diff --git a/docs/crd-reference.md b/docs/crd-reference.md index f4379b8..803f920 100644 --- a/docs/crd-reference.md +++ b/docs/crd-reference.md @@ -69,6 +69,25 @@ _Appears in:_ | `iamKeySecret` _[IAMKeySecret](#iamkeysecret)_ | Details of the k8s secret which contains the GCP service account authentication key (in JSON).
[[Ref](https://cloud.google.com/kubernetes-engine/docs/tutorials/authenticating-to-cloud-platform)].
This is a pointer because structs with string slices can not be compared for zero values | | | +#### CPUMetricType + +_Underlying type:_ _string_ + +CPUMetricType identifies which Cloud Monitoring CPU metric is being used +for autoscaling decisions. + +_Validation:_ +- Enum: [HighPriority Total] + +_Appears in:_ +- [SpannerAutoscalerStatus](#spannerautoscalerstatus) + +| Field | Description | +| --- | --- | +| `HighPriority` | CPUMetricTypeHighPriority uses spanner.googleapis.com/instance/cpu/utilization_by_priority
with priority=high filter.
| +| `Total` | CPUMetricTypeTotal uses spanner.googleapis.com/instance/cpu/utilization (all priorities).
| + + #### ComputeType _Underlying type:_ _string_ @@ -158,7 +177,7 @@ _Appears in:_ | `processingUnits` _[ScaleConfigPUs](#scaleconfigpus)_ | ProcessingUnits for scaling of the Spanner instance. Ref: [Spanner Compute Capacity](https://cloud.google.com/spanner/docs/compute-capacity#compute_capacity) | | | | `scaledownStepSize` _[IntOrString](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.22/#intorstring-intstr-util)_ | The maximum number of processing units which can be deleted in one scale-down operation. It can be a multiple of 100 for values < 1000, or a multiple of 1000 otherwise.
It can also be a percentage of the total number of processing units at the start of the scale-down operation. | 2000 | | | `scaledownInterval` _[Duration](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.22/#duration-v1-meta)_ | How often autoscaler is reevaluated for scale down.
The cool down period between two consecutive scaledown operations. If this option is omitted, the value of the `--scale-down-interval` command line option is taken as the default value. | | | -| `scaleupStepSize` _[IntOrString](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.22/#intorstring-intstr-util)_ | The maximum number of processing units which can be added in one scale-up operation. It can be a multiple of 100 for values < 1000, or a multiple of 1000 otherwise.
It can also be a percentage of the total number of processing units at the start of the scale-down operation. | 0 | | +| `scaleupStepSize` _[IntOrString](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.22/#intorstring-intstr-util)_ | The maximum number of processing units which can be added in one scale-up operation. It can be a multiple of 100 for values < 1000, or a multiple of 1000 otherwise.
It can also be a percentage of the total number of processing units at the start of the scale-up operation. | 0 | | | `scaleupInterval` _[Duration](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.22/#duration-v1-meta)_ | How often autoscaler is reevaluated for scale up.
The warm up period between two consecutive scaleup operations. If this option is omitted, the value of the `--scale-up-interval` command line option is taken as the default value. | | | | `targetCPUUtilization` _[TargetCPUUtilization](#targetcpuutilization)_ | The CPU utilization which the autoscaling will try to achieve. Ref: [Spanner CPU utilization](https://cloud.google.com/spanner/docs/cpu-utilization#task-priority) | | | @@ -246,9 +265,9 @@ _Appears in:_ | Field | Description | Default | Validation | | --- | --- | --- | --- | -| `targetResource` _string_ | The `SpannerAutoscaler` resource name with which this schedule will be registered | | | -| `additionalProcessingUnits` _integer_ | The extra compute capacity which will be added when this schedule is active | | | -| `schedule` _[Schedule](#schedule)_ | The details of when and for how long this schedule will be active | | | +| `targetResource` _string_ | The `SpannerAutoscaler` resource name with which this schedule will be registered.
Immutable after creation. | | | +| `additionalProcessingUnits` _integer_ | The extra compute capacity which will be added when this schedule is active.
This is the only field that can be updated after creation. | | | +| `schedule` _[Schedule](#schedule)_ | The details of when and for how long this schedule will be active.
Immutable after creation. | | | #### SpannerAutoscaleScheduleStatus @@ -324,6 +343,8 @@ _Appears in:_ | `desiredMaxPUs` _integer_ | Maximum number of processing units based on the currently active schedules | | | | `instanceState` _[InstanceState](#instancestate)_ | State of the Cloud Spanner instance | | | | `currentHighPriorityCPUUtilization` _integer_ | Current average CPU utilization for high priority task, represented as a percentage | | | +| `currentTotalCPUUtilization` _integer_ | Current total CPU utilization (all priorities), represented as a percentage.
This field is populated only when spec.scaleConfig.targetCPUUtilization.total is specified. | | | +| `currentCPUMetricType` _[CPUMetricType](#cpumetrictype)_ | CurrentCPUMetricType is the CPU metric type that was used in the last sync cycle.
The controller uses this to detect metric-type switches and skip scaling until
the status reflects the newly configured metric type. | | Enum: [HighPriority Total]
| #### TargetCPUUtilization @@ -339,7 +360,8 @@ _Appears in:_ | Field | Description | Default | Validation | | --- | --- | --- | --- | -| `highPriority` _integer_ | Desired CPU utilization for 'High Priority' CPU consumption category. Ref: [Spanner CPU utilization](https://cloud.google.com/spanner/docs/cpu-utilization#task-priority) | | ExclusiveMaximum: true
ExclusiveMinimum: true
Maximum: 100
Minimum: 0
| +| `highPriority` _integer_ | Desired CPU utilization for 'High Priority' CPU consumption category. Ref: [Spanner CPU utilization](https://cloud.google.com/spanner/docs/cpu-utilization#task-priority)
Mutually exclusive with 'total'. Exactly one of 'highPriority' or 'total' must be specified. | | ExclusiveMaximum: true
ExclusiveMinimum: true
Maximum: 100
Minimum: 0
Optional: \{\}
| +| `total` _integer_ | Desired total CPU utilization (all priorities combined). Ref: [Spanner CPU utilization](https://cloud.google.com/spanner/docs/cpu-utilization)
Mutually exclusive with 'highPriority'. Exactly one of 'highPriority' or 'total' must be specified. | | ExclusiveMaximum: true
ExclusiveMinimum: true
Maximum: 100
Minimum: 0
Optional: \{\}
| #### TargetInstance