ClusterBootstrap

A ClusterBootstrap drives the end-to-end provisioning of a Butler management cluster from bare metal or cloud VMs through to a fully operational platform.

API Version

butler.butlerlabs.dev/v1alpha1

Scope

Namespaced

Short Name

cb

Description

ClusterBootstrap is the central resource for management cluster provisioning. It coordinates VM creation, Talos Linux configuration, Kubernetes bootstrap, and addon installation across both on-prem and cloud providers.

The bootstrap controller runs inside a temporary KIND cluster on the operator's workstation. It watches ClusterBootstrap resources and reconciles each one through a strict phase sequence:

  1. Create MachineRequest resources for each node
  2. Wait for provider controllers to provision VMs and report IPs
  3. Generate and apply Talos machine configurations
  4. Bootstrap the first control plane node
  5. Install platform addons in dependency order
  6. (HA only) Pivot the management plane onto the new cluster

On-prem providers (Harvester, Nutanix, Proxmox) use kube-vip for control plane HA with a floating VIP. Cloud providers (GCP, AWS, Azure) create a LoadBalancerRequest to provision a cloud-native L4 load balancer as the control plane endpoint.

Specification

Full Example (HA On-Prem)

apiVersion: butler.butlerlabs.dev/v1alpha1
kind: ClusterBootstrap
metadata:
  name: butler-mgmt
  namespace: butler-system
spec:
  provider: harvester
  providerRef:
    name: harvester-prod
    namespace: butler-system

  cluster:
    name: butler-mgmt
    topology: ha
    controlPlane:
      replicas: 3
      cpu: 4
      memoryMB: 16384
      diskGB: 100
    workers:
      replicas: 3
      cpu: 8
      memoryMB: 32768
      diskGB: 200
      extraDisks:
        - sizeGB: 500
          storageClass: longhorn

  network:
    podCIDR: 10.244.0.0/16
    serviceCIDR: 10.96.0.0/12
    vip: "10.40.0.200"
    vipInterface: eth0
    loadBalancerPool:
      start: "10.40.0.210"
      end: "10.40.0.250"

  talos:
    version: v1.9.2
    schematic: ce4c980550dd2ab1b17bbf2b08801c7eb59418eafe8f279833297925d67c7515
    installDisk: /dev/vda

  addons:
    cni:
      type: cilium
      hubbleEnabled: true
    storage:
      type: longhorn
      replicaCount: 3
    loadBalancer:
      type: metallb
    controlPlaneHA:
      type: kube-vip
    certManager:
      enabled: true
    ingress:
      type: traefik
      enabled: true
    controlPlaneProvider:
      type: steward
      enabled: true
    capi:
      enabled: true
      version: v1.9.4
    butlerController:
      enabled: true
    gitOps:
      type: flux
      enabled: true

  controlPlaneExposure:
    mode: LoadBalancer

Spec Fields

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| provider | string | Yes | -- | Infrastructure provider. One of: harvester, nutanix, proxmox, gcp, aws, azure. |
| providerRef | ProviderReference | Yes | -- | References the ProviderConfig with infrastructure credentials. |
| cluster | ClusterBootstrapClusterSpec | Yes | -- | Cluster topology and node sizing. |
| network | ClusterBootstrapNetworkSpec | Yes | -- | Pod CIDR, service CIDR, VIP, load balancer pool. |
| talos | ClusterBootstrapTalosSpec | Yes | -- | Talos Linux version, schematic, install disk, config patches. |
| addons | ClusterBootstrapAddonsSpec | No | See below | Platform addons to install. |
| controlPlaneExposure | ControlPlaneExposureSpec | No | LoadBalancer | How tenant control planes are exposed after bootstrap. Written to ButlerConfig. |
| paused | bool | No | false | Pauses reconciliation when true. |

ClusterBootstrapClusterSpec

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| name | string | Yes | -- | Cluster name. DNS-safe, 1-63 chars, pattern ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$. |
| topology | string | No | ha | ha for high-availability (3+ CP nodes + workers) or single-node (1 CP, no workers). |
| controlPlane | ClusterBootstrapNodePool | Yes | -- | Control plane node pool. |
| workers | ClusterBootstrapNodePool | No | -- | Worker node pool. Ignored when topology is single-node. |
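
The name constraint can be checked locally before a manifest is applied. The following is a minimal Python sketch, not the controller's own validation code; the function name is_valid_cluster_name is hypothetical, and only the pattern and the 1-63 character limit come from the table above.

```python
import re

# Pattern and length limit copied from the `name` field description.
NAME_PATTERN = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")
MAX_NAME_LENGTH = 63


def is_valid_cluster_name(name: str) -> bool:
    """Return True if name is 1-63 characters and matches the documented pattern."""
    if not 1 <= len(name) <= MAX_NAME_LENGTH:
        return False
    # fullmatch avoids accepting a trailing newline, which `$` alone would allow.
    return NAME_PATTERN.fullmatch(name) is not None
```

Under this sketch, butler-mgmt passes, while names with uppercase letters or a leading or trailing hyphen are rejected.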

ClusterBootstrapNodePool

| Field | Type | Required | Validation | Description |
| --- | --- | --- | --- | --- |
| replicas | int32 | Yes | 1-10 | Number of nodes. Forced to 1 for single-node topology. |
| cpu | int32 | Yes | 1-128 | CPU cores per node. |
| memoryMB | int32 | Yes | min 2048 | Memory in megabytes per node. |
| diskGB | int32 | Yes | min 20 | Root disk size in gigabytes per node. |
| extraDisks | []DiskSpec | No | -- | Additional disks. Each has sizeGB (min 1) and optional storageClass. |
| labels | map[string]string | No | -- | Labels applied to nodes in this pool. |
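
As an illustration of the bounds in the Validation column, here is a small Python sketch that checks a node pool given as a plain dictionary. It is an assumption-laden example rather than Butler's implementation: the names BOUNDS and validate_node_pool are hypothetical, and the error wording is made up for the example.

```python
from typing import List

# (minimum, maximum) bounds from the Validation column; None means no upper bound.
BOUNDS = {
    "replicas": (1, 10),
    "cpu": (1, 128),
    "memoryMB": (2048, None),
    "diskGB": (20, None),
}


def validate_node_pool(pool: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the pool is valid."""
    errors = []
    for field, (low, high) in BOUNDS.items():
        value = pool.get(field)
        if value is None:
            errors.append(f"{field} is required")
        elif value < low or (high is not None and value > high):
            limit = f"{low}-{high}" if high is not None else f"min {low}"
            errors.append(f"{field}={value} is outside the allowed range ({limit})")
    # Each extra disk needs sizeGB >= 1; storageClass is optional.
    for i, disk in enumerate(pool.get("extraDisks", [])):
        if disk.get("sizeGB", 0) < 1:
            errors.append(f"extraDisks[{i}].sizeGB must be at least 1")
    return errors
```

The workers pool from the full example above (3 replicas, 8 CPUs, 32768 MB, 200 GB, one 500 GB extra disk) produces no errors with this check.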

ClusterBootstrapNetworkSpec

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| podCIDR | string | Yes | -- | CIDR for pod networking. Pattern: ^([0-9]{1,3}\.){3}[0-9]{1,3}/[0-9]{1,2}$. |
| serviceCIDR | string | Yes | -- | CIDR for service networking. Same pattern. |
| vip | string | No | -- | Control plane endpoint. For on-prem: a floating IP managed by kube-vip. For cloud: optional, set automatically from LoadBalancerRequest endpoint. Accepts IP addresses and DNS hostnames. |
| vipInterface | string | No | -- | Network interface for the VIP. Only relevant for on-prem with kube-vip. Auto-detected by kube-vip if not specified. |
| loadBalancerPool | LoadBalancerPoolSpec | No | -- | IP range for MetalLB. Must not overlap with VIP. |
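
The CIDR pattern above is purely syntactic: it also matches strings such as 999.1.1.1/99 that are not real networks. A pre-flight check might therefore combine the pattern with a semantic parse. The Python sketch below shows one way to do that; the helper name check_cidr is hypothetical, and whether the controller performs an equivalent semantic check is not stated in this reference.

```python
import ipaddress
import re

# Pattern copied from the podCIDR / serviceCIDR field descriptions.
CIDR_PATTERN = re.compile(r"^([0-9]{1,3}\.){3}[0-9]{1,3}/[0-9]{1,2}$")


def check_cidr(cidr: str) -> bool:
    """Return True if cidr matches the documented pattern and parses as an IPv4 network."""
    if CIDR_PATTERN.fullmatch(cidr) is None:
        return False
    try:
        # strict=False tolerates host bits; only octet and prefix ranges are checked.
        ipaddress.ip_network(cidr, strict=False)
    except ValueError:
        return False
    return True
```

Both CIDRs used in the examples on this page (10.244.0.0/16 and 10.96.0.0/12) pass this check.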

LoadBalancerPoolSpec:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| start | string | Yes | First IP in the pool (inclusive). |
| end | string | Yes | Last IP in the pool (inclusive). |

Validation: start must be less than or equal to end. If vip is an IP address (not a hostname), it must not fall within the pool range.
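
The two pool rules can be expressed compactly with Python's standard ipaddress module. This is a sketch of the logic only, assuming IPv4 addresses throughout; the function name validate_load_balancer_pool is hypothetical, and the real validation is implemented in the controller rather than in Python.

```python
import ipaddress
from typing import List, Optional


def validate_load_balancer_pool(start: str, end: str, vip: Optional[str] = None) -> List[str]:
    """Check start <= end and that an IP-valued vip lies outside [start, end]."""
    errors = []
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    if first > last:
        errors.append(f"start {start} must be less than or equal to end {end}")
    if vip is not None:
        try:
            vip_ip = ipaddress.ip_address(vip)
        except ValueError:
            vip_ip = None  # vip is a DNS hostname, so the overlap rule does not apply
        if vip_ip is not None and first <= vip_ip <= last:
            errors.append(f"vip {vip} falls within the pool {start}-{end}")
    return errors
```

With the values from the full example (pool 10.40.0.210 to 10.40.0.250, VIP 10.40.0.200) the function returns an empty list; moving the VIP to 10.40.0.220 yields one error.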

ClusterBootstrapTalosSpec

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| version | string | Yes | -- | Talos version. Pattern: ^v[0-9]+\.[0-9]+\.[0-9]+$. |
| schematic | string | Yes | -- | Talos factory schematic ID for the boot image. |
| installDisk | string | No | /dev/vda | Disk device for Talos installation. |
| configPatches | []TalosConfigPatch | No | -- | Inline Talos config patches (RFC 6902 JSON Patch format). |

TalosConfigPatch:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| op | string | Yes | Patch operation: add, remove, or replace. |
| path | string | Yes | JSON path to patch. |
| value | string | No | Value to set (required for add and replace). |
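
The field rules for a single patch entry can be sketched as follows. This Python example is illustrative only: the function name validate_config_patch is hypothetical, and it checks just the three constraints listed in the table (allowed operations, required path, and value required for add and replace).

```python
from typing import List

# Operations listed in the TalosConfigPatch table.
ALLOWED_OPS = {"add", "remove", "replace"}


def validate_config_patch(patch: dict) -> List[str]:
    """Return a list of problems with one configPatches entry; empty means valid."""
    errors = []
    op = patch.get("op")
    if op not in ALLOWED_OPS:
        errors.append(f"op must be one of {sorted(ALLOWED_OPS)}, got {op!r}")
    if not patch.get("path"):
        errors.append("path is required")
    # value is optional in the schema but needed for add and replace.
    if op in ("add", "replace") and "value" not in patch:
        errors.append(f"value is required for op {op!r}")
    return errors
```

A remove patch with only op and path is valid under this sketch, whereas an add patch without value is reported as an error.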

ClusterBootstrapAddonsSpec

All addon sub-specs are optional. Defaults are applied when the field is omitted.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| cni | CNIAddonSpec | type: cilium | Container networking. |
| storage | StorageAddonSpec | type: longhorn, replicaCount: 3 | Persistent storage. |
| loadBalancer | LoadBalancerAddonSpec | type: metallb | LoadBalancer service implementation. |
| controlPlaneHA | ControlPlaneHAAddonSpec | type: kube-vip | Control plane HA (on-prem only). |
| certManager | CertManagerAddonSpec | enabled: true | TLS certificate automation. |
| ingress | IngressAddonSpec | type: traefik, enabled: true | Ingress controller. |
| controlPlaneProvider | ControlPlaneProviderAddonSpec | type: steward, enabled: true | Hosted control plane operator. |
| capi | CAPIAddonSpec | enabled: true, version: v1.9.4 | Cluster API core + infrastructure providers. |
| butlerController | ButlerControllerAddonSpec | enabled: true, version: latest | Butler platform controller. |
| gitOps | GitOpsAddonSpec | type: flux, enabled: true | GitOps controller. |
| console | ConsoleAddonSpec | enabled: false | Butler web console. |

The CNI, Storage, CAPI, and Console sub-specs are detailed below. The remaining sub-specs (LoadBalancer, ControlPlaneHA, CertManager, Ingress, ControlPlaneProvider, GitOps, ButlerController) follow the same pattern: enabled (bool), type or provider (string), and version (string).

CNIAddonSpec

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| type | string | cilium | CNI type. Enum: cilium, none. |
| version | string | -- | Override chart version. |
| hubbleEnabled | bool | true | Enable Hubble observability (Cilium only). |

StorageAddonSpec

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| type | string | longhorn | Storage type. Enum: longhorn, none. |
| version | string | -- | Override chart version. |
| replicaCount | *int32 | 3 | Default volume replica count. Forced to 1 for single-node topology. |

CAPIAddonSpec

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | *bool | true | Install Cluster API. |
| version | string | v1.9.4 | CAPI core version. |
| infrastructureProviders | []CAPIInfraProviderSpec | -- | Additional CAPI infrastructure providers. The management cluster's own provider is always included. |

CAPIInfraProviderSpec:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Provider name. Enum: harvester, nutanix, proxmox, gcp, aws, azure. |
| version | string | No | Override provider version. |
| credentialsSecretRef | SecretReference | No | Credentials for providers other than the management cluster's own. |

ConsoleAddonSpec

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | *bool | false | Install Butler Console. |
| version | string | latest | Console image tag. |
| ingress | ConsoleIngressSpec | -- | Ingress configuration for the console. |

ConsoleIngressSpec:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Create an Ingress resource for the console. |
| host | string | butler.<cluster>.local | Hostname for console access. |
| className | string | -- | Ingress class (e.g., traefik, nginx). |
| tls | bool | false | Enable TLS termination. |
| tlsSecretName | string | -- | Name of TLS Secret. |

ControlPlaneExposureSpec

Configures how tenant control planes are exposed after bootstrap. This setting is written to the ButlerConfig singleton and inherited by all TenantClusters.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| mode | string | LoadBalancer | Exposure mode. Enum: LoadBalancer, Ingress, Gateway. |
| hostname | string | -- | Wildcard domain for tenant API servers (e.g., *.k8s.platform.example.com). Required for Ingress and Gateway modes. |
| ingressClassName | string | -- | Ingress class for Ingress mode. |
| controllerType | string | -- | Ingress controller type for TLS passthrough. Enum: haproxy, nginx, traefik, generic. |
| gatewayRef | string | -- | Gateway resource reference for Gateway mode (format: namespace/name). Required for Gateway mode. |

Status

status:
  phase: Ready
  controlPlaneEndpoint: "10.40.0.200"
  kubeconfig: "base64-encoded-kubeconfig..."
  talosconfig: "base64-encoded-talosconfig..."
  consoleURL: "https://butler.mgmt.example.com"
  machines:
    - name: butler-mgmt-cp-0
      role: control-plane
      phase: Running
      ipAddress: "10.40.0.10"
      talosConfigured: true
      ready: true
    - name: butler-mgmt-cp-1
      role: control-plane
      phase: Running
      ipAddress: "10.40.0.11"
      talosConfigured: true
      ready: true
    - name: butler-mgmt-cp-2
      role: control-plane
      phase: Running
      ipAddress: "10.40.0.12"
      talosConfigured: true
      ready: true
  addonsInstalled:
    cilium: true
    cert-manager: true
    longhorn: true
    metallb: true
    kube-vip: true
    steward: true
    capi: true
    butler-controller: true
  conditions:
    - type: Ready
      status: "True"
      lastTransitionTime: "2026-03-10T12:30:00Z"
      reason: "BootstrapComplete"
  lastUpdated: "2026-03-10T12:30:00Z"
  observedGeneration: 1

Status Fields

| Field | Type | Description |
| --- | --- | --- |
| phase | ClusterBootstrapPhase | Current lifecycle phase (see Phases below). |
| controlPlaneEndpoint | string | API server endpoint (VIP for on-prem, LB IP/DNS for cloud). |
| kubeconfig | string | Base64-encoded admin kubeconfig for the new cluster. |
| talosconfig | string | Base64-encoded talosconfig for Talos API access. |
| consoleURL | string | URL of the Butler Console (if installed). |
| machines | []ClusterBootstrapMachineStatus | Per-machine status. |
| failureReason | string | Machine-readable failure reason. |
| failureMessage | string | Human-readable failure details. |
| addonsInstalled | map[string]bool | Tracks which addons completed installation. |
| conditions | []Condition | Standard Kubernetes conditions. |
| lastUpdated | Time | Timestamp of the last status update. |
| observedGeneration | int64 | Last observed spec generation. |
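
Because kubeconfig and talosconfig are stored Base64-encoded, they need decoding before use. The Python sketch below assumes the status has already been loaded into a dictionary (for example from the JSON form of the resource); the helper name decode_status_field is hypothetical.

```python
import base64


def decode_status_field(status: dict, field: str) -> str:
    """Decode a Base64-encoded status field such as kubeconfig or talosconfig."""
    encoded = status.get(field)
    if not encoded:
        # The field is empty until bootstrap has progressed far enough to populate it.
        raise KeyError(f"status.{field} is not populated")
    return base64.b64decode(encoded).decode("utf-8")
```

The decoded kubeconfig text can then be written to a file and passed to kubectl through the KUBECONFIG environment variable.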

ClusterBootstrapMachineStatus

| Field | Type | Description |
| --- | --- | --- |
| name | string | MachineRequest name. |
| role | string | control-plane or worker. |
| phase | string | MachineRequest phase (Pending, Creating, Running, Failed). |
| ipAddress | string | Assigned IP address. |
| talosConfigured | bool | True after Talos config has been applied to the node. |
| ready | bool | True after the node has joined the Kubernetes cluster. |

Phases

| Phase | Description |
| --- | --- |
| Pending | ClusterBootstrap created, not yet reconciled. |
| ProvisioningMachines | MachineRequest resources created, waiting for VMs to report IPs. |
| ConfiguringTalos | All VMs have IPs. Generating and applying Talos machine configurations. |
| BootstrappingCluster | Talos configs applied. Bootstrapping etcd and Kubernetes on the first control plane node. |
| InstallingAddons | Kubernetes API available. Installing platform addons in dependency order. |
| Pivoting | (HA only) Moving management plane onto the new cluster. |
| Ready | Bootstrap complete. Management cluster is fully operational. |
| Failed | Bootstrap failed. Check failureReason and failureMessage. |
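
For tooling that tracks bootstrap progress, the expected phase order can be derived from the table above. The following Python sketch is an illustration under the assumption that phases are visited strictly in the listed order and that Failed can be entered from any phase; the function name expected_phases is hypothetical.

```python
from typing import List


def expected_phases(topology: str) -> List[str]:
    """Return the successful phase sequence for a topology; Failed is terminal and not listed."""
    phases = [
        "Pending",
        "ProvisioningMachines",
        "ConfiguringTalos",
        "BootstrappingCluster",
        "InstallingAddons",
    ]
    if topology == "ha":
        phases.append("Pivoting")  # pivot runs only for HA clusters
    phases.append("Ready")
    return phases
```

For a single-node cluster the sequence has six phases and skips Pivoting; for an HA cluster it has seven.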

Conditions

| Type | Description |
| --- | --- |
| Ready | True when bootstrap is complete and the cluster is operational. |
| Progressing | True while bootstrap is actively working. |
| Failed | True when bootstrap has encountered a terminal failure. |

On-Prem vs Cloud Bootstrap

The bootstrap flow adapts based on the provider type:

| Aspect | On-Prem (Harvester, Nutanix, Proxmox) | Cloud (GCP, AWS, Azure) |
| --- | --- | --- |
| Control plane HA | kube-vip floating VIP | Cloud L4 load balancer via LoadBalancerRequest |
| CP endpoint source | network.vip field | LoadBalancerRequest status.endpoint |
| MetalLB | Installed for LoadBalancer services | Not installed (no loadBalancerPool configured for cloud providers) |
| kube-vip | Installed | Skipped |
| Loopback patch | Not needed | Applied to each CP node so kube-apiserver accepts LB-routed packets |
| Traefik | Installed for ingress | Skipped |

For cloud providers, the bootstrap controller creates a LoadBalancerRequest resource after VMs are provisioned. The cloud provider controller provisions the load balancer and reports its endpoint. That endpoint becomes the controlPlaneEndpoint used in Talos machine configs.
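
The provider-dependent behavior described in this section can be summarized as a lookup. The Python sketch below only restates the comparison table; the function name bootstrap_plan and the dictionary keys are hypothetical and do not correspond to fields in the API.

```python
ON_PREM_PROVIDERS = {"harvester", "nutanix", "proxmox"}
CLOUD_PROVIDERS = {"gcp", "aws", "azure"}


def bootstrap_plan(provider: str) -> dict:
    """Summarize which components the bootstrap flow uses for a provider."""
    if provider in ON_PREM_PROVIDERS:
        return {
            "endpoint_source": "network.vip",
            "kube_vip": True,
            "metallb": True,
            "traefik": True,
            "loopback_patch": False,
            "load_balancer_request": False,
        }
    if provider in CLOUD_PROVIDERS:
        return {
            "endpoint_source": "LoadBalancerRequest status.endpoint",
            "kube_vip": False,
            "metallb": False,
            "traefik": False,
            "loopback_patch": True,  # lets kube-apiserver accept LB-routed packets
            "load_balancer_request": True,
        }
    raise ValueError(f"unknown provider: {provider}")
```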

Validation Rules

The following constraints are enforced, by CEL validation rules unless noted otherwise:

  • When controlPlaneExposure.mode is Ingress, hostname must be set.
  • When controlPlaneExposure.mode is Gateway, both hostname and gatewayRef must be set.
  • network.vip must not fall within network.loadBalancerPool range (validated in Go, prevents kube-vip and MetalLB conflicts).
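
The two controlPlaneExposure rules can be mirrored in a pre-flight check. This Python sketch approximates the constraints listed above and is not the CEL source itself; the function name validate_exposure is hypothetical, and the default mode of LoadBalancer is taken from the ControlPlaneExposureSpec table.

```python
from typing import List

EXPOSURE_MODES = {"LoadBalancer", "Ingress", "Gateway"}


def validate_exposure(exposure: dict) -> List[str]:
    """Return a list of violations of the controlPlaneExposure rules; empty means valid."""
    errors = []
    mode = exposure.get("mode", "LoadBalancer")
    if mode not in EXPOSURE_MODES:
        errors.append(f"mode must be one of {sorted(EXPOSURE_MODES)}, got {mode!r}")
    # hostname is required for both Ingress and Gateway modes.
    if mode in ("Ingress", "Gateway") and not exposure.get("hostname"):
        errors.append(f"hostname must be set when mode is {mode}")
    if mode == "Gateway" and not exposure.get("gatewayRef"):
        errors.append("gatewayRef must be set when mode is Gateway")
    return errors
```

A Gateway configuration with both hostname and gatewayRef set passes, while omitting both yields two errors.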

Topology Comparison

| Aspect | Single-Node | HA |
| --- | --- | --- |
| Control plane nodes | 1 | 3 (recommended) |
| Worker nodes | 0 (CP is schedulable) | 1+ |
| etcd | Single member | 3-member cluster |
| Storage replicas | Forced to 1 | Default 3 |
| kube-vip | Skipped | Installed (on-prem) |
| Pivoting | Skipped | Enabled |
| Use case | Dev, testing, edge | Production |
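
The overrides that single-node topology applies can be sketched as a small normalization step. This Python example only restates the comparison above; the function name effective_settings and the keys of the returned dictionary are hypothetical.

```python
def effective_settings(topology: str, cp_replicas: int, worker_replicas: int,
                       storage_replicas: int = 3, on_prem: bool = True) -> dict:
    """Return the values that take effect after topology-specific overrides."""
    if topology == "single-node":
        # Requested sizes are overridden: one schedulable CP node, no workers.
        return {
            "control_plane_replicas": 1,
            "worker_replicas": 0,
            "storage_replica_count": 1,
            "kube_vip": False,
            "pivot": False,
        }
    return {
        "control_plane_replicas": cp_replicas,
        "worker_replicas": worker_replicas,
        "storage_replica_count": storage_replicas,
        "kube_vip": on_prem,  # kube-vip is installed only for on-prem providers
        "pivot": True,
    }
```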

Finalizer

| Finalizer | Purpose |
| --- | --- |
| clusterbootstrap.butler.butlerlabs.dev/finalizer | Ensures cleanup of MachineRequests, LoadBalancerRequests, and Secrets before CR deletion. |

kubectl Output

$ kubectl get cb -n butler-system
NAME          CLUSTER       TOPOLOGY   PHASE   ENDPOINT      AGE
butler-mgmt   butler-mgmt   ha         Ready   10.40.0.200   2h

Examples

Single-Node Development

apiVersion: butler.butlerlabs.dev/v1alpha1
kind: ClusterBootstrap
metadata:
  name: dev-cluster
  namespace: butler-system
spec:
  provider: harvester
  providerRef:
    name: harvester-dev
    namespace: butler-system
  cluster:
    name: dev-cluster
    topology: single-node
    controlPlane:
      replicas: 1
      cpu: 4
      memoryMB: 16384
      diskGB: 100
  network:
    podCIDR: 10.244.0.0/16
    serviceCIDR: 10.96.0.0/12
    vip: "10.40.0.100"
  talos:
    version: v1.9.2
    schematic: ce4c980550dd2ab1b17bbf2b08801c7eb59418eafe8f279833297925d67c7515

HA Cloud (GCP)

apiVersion: butler.butlerlabs.dev/v1alpha1
kind: ClusterBootstrap
metadata:
  name: butler-prod
  namespace: butler-system
spec:
  provider: gcp
  providerRef:
    name: gcp-prod
    namespace: butler-system
  cluster:
    name: butler-prod
    topology: ha
    controlPlane:
      replicas: 3
      cpu: 4
      memoryMB: 16384
      diskGB: 100
    workers:
      replicas: 3
      cpu: 8
      memoryMB: 32768
      diskGB: 200
  network:
    podCIDR: 10.244.0.0/16
    serviceCIDR: 10.96.0.0/12
  talos:
    version: v1.9.2
    schematic: ce4c980550dd2ab1b17bbf2b08801c7eb59418eafe8f279833297925d67c7515
  addons:
    cni:
      type: cilium
      hubbleEnabled: true
    storage:
      type: longhorn
    controlPlaneProvider:
      type: steward
    capi:
      enabled: true
      infrastructureProviders:
        - name: gcp
  controlPlaneExposure:
    mode: LoadBalancer

HA On-Prem with Gateway Exposure

apiVersion: butler.butlerlabs.dev/v1alpha1
kind: ClusterBootstrap
metadata:
  name: butler-mgmt
  namespace: butler-system
spec:
  provider: nutanix
  providerRef:
    name: nutanix-dc1
    namespace: butler-system
  cluster:
    name: butler-mgmt
    topology: ha
    controlPlane:
      replicas: 3
      cpu: 4
      memoryMB: 16384
      diskGB: 100
    workers:
      replicas: 3
      cpu: 8
      memoryMB: 32768
      diskGB: 200
  network:
    podCIDR: 10.244.0.0/16
    serviceCIDR: 10.96.0.0/12
    vip: "10.0.0.200"
    loadBalancerPool:
      start: "10.0.0.210"
      end: "10.0.0.250"
  talos:
    version: v1.9.2
    schematic: ce4c980550dd2ab1b17bbf2b08801c7eb59418eafe8f279833297925d67c7515
  addons:
    ingress:
      type: traefik
    console:
      enabled: true
      ingress:
        enabled: true
        host: butler.platform.example.com
        className: traefik
        tls: true
  controlPlaneExposure:
    mode: Gateway
    hostname: "*.k8s.platform.example.com"
    gatewayRef: "butler-system/tenant-gateway"
