TenantCluster

A TenantCluster represents a Kubernetes cluster managed by Butler for running tenant workloads.

API Version

butler.butlerlabs.dev/v1alpha1

Scope

Namespaced

Short Name

tc

Description

TenantCluster is the primary resource users interact with to provision Kubernetes clusters. When a TenantCluster is created, Butler:

  1. Creates a hosted control plane via Steward (TenantControlPlane)
  2. Provisions worker VMs via the configured infrastructure provider
  3. Bootstraps workers to join the cluster
  4. Installs platform addons (CNI, LoadBalancer, etc.)

Specification

Full Example

apiVersion: butler.butlerlabs.dev/v1alpha1
kind: TenantCluster
metadata:
  name: my-cluster
  namespace: team-backend
  labels:
    butler.butlerlabs.dev/team: backend-team
spec:
  kubernetesVersion: "v1.31.0"

  teamRef:
    name: backend-team

  controlPlane:
    replicas: 1
    externalCloudProvider: true
    resources:
      apiServer:
        requests:
          cpu: "100m"
          memory: "256Mi"
        limits:
          cpu: "2"
          memory: "1Gi"

  workers:
    replicas: 3
    machineTemplate:
      cpu: 4
      memory: "16Gi"
      diskSize: "100Gi"
      os:
        type: talos
        version: "v1.12.2"

  providerConfigRef:
    name: harvester-prod

  networking:
    podCIDR: 10.244.0.0/16
    serviceCIDR: 10.96.0.0/12
    loadBalancerPool:
      start: "10.40.1.100"
      end: "10.40.1.120"

  addons:
    cni:
      provider: cilium
      version: "1.17.0"
    loadBalancer:
      provider: metallb
      version: "0.14.9"
    storage:
      provider: longhorn
      version: "1.7.2"
    certManager:
      enabled: true
      version: "1.16.3"

Spec Fields

| Field | Type | Required | Description |
|---|---|---|---|
| kubernetesVersion | string | Yes | Kubernetes version in the format vX.Y.Z (e.g., "v1.31.0") |
| teamRef | object | No | Reference to the Team resource. Required when multi-tenancy mode is Enforced |
| controlPlane | object | No | Control plane configuration |
| workers | object | Yes | Worker node configuration |
| providerConfigRef | object | No | Reference to ProviderConfig. Falls back to Team or platform default |
| networking | object | No | Network CIDR and load balancer pool configuration |
| addons | object | No | Addon configuration |
| managementPolicy | object | No | How Butler manages this cluster after initial setup |
| infrastructureOverride | object | No | Per-cluster overrides for provider-specific settings |
| workspaces | object | No | Cloud development environment configuration |

controlPlane

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| replicas | integer | No | 1 | API server replicas (1-3) |
| dataStoreRef | object | No | | Reference to Steward DataStore |
| serviceType | string | No | | LoadBalancer, NodePort, or ClusterIP. Inherits from ButlerConfig if not set |
| certSANs | array | No | | Additional SANs for API server cert |
| externalCloudProvider | bool | No | true | Enable --cloud-provider=external on apiserver and controller-manager. Required for Harvester, vSphere, and cloud providers |
| resources | object | No | | Per-component resource limits. Overrides ButlerConfig defaults |

controlPlane.resources

Per-component resource limits for control plane pods. If a component is set here, it fully replaces the ButlerConfig default for that component. Components not set here inherit from ButlerConfig.spec.defaultControlPlaneResources.

controlPlane:
  resources:
    apiServer:
      requests:
        cpu: "100m"
        memory: "256Mi"
      limits:
        cpu: "2"
        memory: "1Gi"
    controllerManager:
      requests:
        cpu: "50m"
        memory: "64Mi"
      limits:
        cpu: "500m"
        memory: "256Mi"
    scheduler:
      requests:
        cpu: "25m"
        memory: "32Mi"
      limits:
        cpu: "250m"
        memory: "128Mi"
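
The replacement behavior described above can be sketched with a partial override. This is a minimal illustration, assuming that the layout under ButlerConfig.spec.defaultControlPlaneResources mirrors the TenantCluster layout (the source does not show it). All values are hypothetical.

```yaml
# Assumed ButlerConfig excerpt, shown as comments for comparison only.
# The layout and values are illustrative, not confirmed by this reference.
#
# spec:
#   defaultControlPlaneResources:
#     apiServer:
#       requests: {cpu: "100m", memory: "256Mi"}
#     scheduler:
#       requests: {cpu: "25m", memory: "32Mi"}
#       limits: {cpu: "250m", memory: "128Mi"}

# TenantCluster excerpt: only the scheduler is set here.
controlPlane:
  resources:
    scheduler:
      requests:
        cpu: "50m"
        memory: "64Mi"
      # No limits are set. Because a component set here fully replaces the
      # ButlerConfig default, the default scheduler limits should not be
      # merged back in. apiServer is not set, so it inherits the default.
```

Under this reading, overriding one field of a component means restating every request and limit you still want for that component.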

workers

| Field | Type | Required | Description |
|---|---|---|---|
| replicas | integer | Yes | Number of worker nodes (minimum 1) |
| machineTemplate | object | No | Machine specifications |

workers.machineTemplate

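| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| cpu | integer | No | 4 | CPU cores per worker |
| memory | quantity | No | 16Gi | Memory per worker |
| diskSize | quantity | No | 100Gi | Root disk size per worker |
| os | object | No | | Operating system configuration |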

workers.machineTemplate.os

Butler supports multiple operating systems for worker nodes. The OS type determines how the node is bootstrapped and joins the cluster.

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| type | string | No | rocky | OS type: talos, rocky, flatcar, bottlerocket, kairos |
| version | string | No | 9.5 | OS version |
| imageRef | string | No | | Specific image reference. Overrides type and version |
| schematicID | string | No | | Butler Image Factory schematic ID. Enables automatic image syncing to the target provider |
| sshAuthorizedKey | string | No | | SSH public key for node access. Falls back to ButlerConfig default. Not applicable to Talos |
| talos | object | No | | Talos-specific configuration. Required when type is talos |

OS Types:

| Type | Bootstrap Method | Description |
|---|---|---|
| talos | Talos machine config via dataSecretName, post-boot apply-config via talosctl | Immutable Linux. Addons wait for config-applied annotation on all workers |
| rocky | CABPK KubeadmConfigTemplate via configRef, kubeadm join via cloud-init | Rocky Linux. Max K8s version: v1.30.2 (v1.31+ causes BootstrapFailed on Harvester) |
| flatcar | Ignition JSON via dataSecretName, kubelet auto-joins via bootstrap token | Flatcar Container Linux. Immutable |
| bottlerocket | TOML settings via dataSecretName | Bottlerocket. Immutable |
| kairos | cloud-config YAML via dataSecretName | Kairos. Immutable, cloud-config based |
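
As a sketch of selecting a non-default OS, the fragment below requests Flatcar workers with an SSH key for node access. The version string and the key are hypothetical placeholders; this reference does not state which Flatcar versions Butler accepts.

```yaml
workers:
  replicas: 3
  machineTemplate:
    os:
      type: flatcar
      # Hypothetical version string; check which versions your platform provides.
      version: "4152.2.0"
      # Placeholder public key. If omitted, the ButlerConfig default is used.
      sshAuthorizedKey: "ssh-ed25519 AAAA... ops@example.com"
```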

workers.machineTemplate.os.talos

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| installDisk | string | No | /dev/vda | Disk device for Talos installation |
| installerImage | string | No | | Talos installer image (e.g., factory.talos.dev/installer/<schematic>:v1.12.2) |
| version | string | No | v1.9.3 | Talos version |
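
A Talos worker definition with an explicit talos block might look like the sketch below. It uses only the fields listed above. The `<schematic>` placeholder is left unresolved on purpose, and keeping os.version and talos.version in step is an assumption, since this reference does not say how the two fields relate.

```yaml
workers:
  replicas: 3
  machineTemplate:
    os:
      type: talos
      version: "v1.12.2"
      talos:
        # Same value as the documented default, written out for clarity.
        installDisk: /dev/vda
        # Replace <schematic> with your own schematic ID.
        installerImage: "factory.talos.dev/installer/<schematic>:v1.12.2"
        # Assumed to match os.version; the default would otherwise be v1.9.3.
        version: "v1.12.2"
```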

providerConfigRef

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Name of the ProviderConfig |
| namespace | string | No | Namespace of the ProviderConfig. Defaults to butler-system |

networking

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| podCIDR | string | No | 10.244.0.0/16 | Pod network CIDR |
| serviceCIDR | string | No | 10.96.0.0/12 | Service network CIDR |
| loadBalancerPool | object | No | | IP pool for LoadBalancer services. Auto-populated when IPAM is active |
| lbPoolSize | integer | No | | Override the default LB pool size from the provider. Only used with IPAM |

networking.loadBalancerPool

| Field | Type | Required | Description |
|---|---|---|---|
| start | string | Yes | First IP address in the pool |
| end | string | Yes | Last IP address in the pool |
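
The two ways of obtaining a load balancer pool can be contrasted as follows. This is a sketch based on the field descriptions above: the pool size of 16 is a hypothetical value, and the assumption is that you set either a static pool or an IPAM size override, not both.

```yaml
# Option A: static pool, set explicitly.
# start and end are the first and last addresses, so this range
# covers 10.40.1.100 through 10.40.1.120 (21 addresses).
networking:
  loadBalancerPool:
    start: "10.40.1.100"
    end: "10.40.1.120"
---
# Option B: IPAM is active, so loadBalancerPool is auto-populated.
# lbPoolSize only overrides the provider's default pool size.
networking:
  lbPoolSize: 16   # hypothetical value
```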

addons

Addons are configured as structured fields, not as an array. Every addon except gitops requires a version field.

| Field | Type | Description |
|---|---|---|
| cni | object | CNI configuration (Cilium) |
| loadBalancer | object | Load balancer (MetalLB) |
| certManager | object | Certificate management |
| storage | object | Persistent storage (Longhorn, Linstor) |
| ingress | object | Ingress controller (Traefik, Nginx) |
| gitops | object | GitOps (Flux or ArgoCD) |

addons.cni

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| provider | string | No | cilium | CNI provider. Only cilium is currently supported |
| version | string | Yes | | Cilium version (e.g., 1.17.0) |
| values | object | No | | Custom Helm values |
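
The values field can be illustrated with a small Cilium override. The keys under values are Cilium Helm chart settings. That Butler passes them to the chart unchanged is an assumption based on the "Custom Helm values" description.

```yaml
addons:
  cni:
    provider: cilium
    version: "1.17.0"
    values:
      # Cilium Helm chart values: enable Hubble Relay and the Hubble UI.
      hubble:
        relay:
          enabled: true
        ui:
          enabled: true
```

The same pattern presumably applies to the values field of the other addons, using each chart's own keys.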

addons.loadBalancer

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| provider | string | No | metallb | Load balancer provider. Only metallb is currently supported |
| version | string | Yes | | MetalLB version (e.g., 0.14.9) |
| values | object | No | | Custom Helm values |

addons.certManager

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| enabled | bool | No | true | Whether to install cert-manager |
| version | string | Yes | | cert-manager version (e.g., 1.16.3) |
| values | object | No | | Custom Helm values |

addons.storage

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| provider | string | No | | Storage provider: longhorn or linstor |
| version | string | Yes | | Storage provider version |
| values | object | No | | Custom Helm values |

addons.ingress

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| provider | string | No | | Ingress controller: traefik or nginx |
| version | string | Yes | | Ingress controller version |
| values | object | No | | Custom Helm values |

addons.gitops

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| provider | string | No | | GitOps tool: fluxcd or argocd |
| version | string | No | | GitOps tool version |
| repository | object | No | | Git repository configuration |

addons:
  gitops:
    provider: fluxcd
    version: "2.4.0"
    repository:
      url: https://github.com/org/clusters
      branch: main
      path: clusters/my-cluster
      secretRef:
        name: git-credentials

managementPolicy

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| mode | string | No | Active | Management mode: Active, Observe, or GitOps |

  • Active: Butler actively manages addons. New addons in spec are installed.
  • Observe: Butler only observes after initial install. Spec changes are ignored.
  • GitOps: Butler bootstraps Flux and hands off addon management to GitOps.
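
The mode is set under spec.managementPolicy. The sketch below selects Observe, which suits a cluster whose addons are changed by hand after the initial install. The cluster name and namespace are reused from the full example.

```yaml
apiVersion: butler.butlerlabs.dev/v1alpha1
kind: TenantCluster
metadata:
  name: my-cluster
  namespace: team-backend
spec:
  kubernetesVersion: "v1.31.0"
  workers:
    replicas: 3
  managementPolicy:
    # Butler installs the addons once, then ignores later spec changes.
    mode: Observe
```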

infrastructureOverride

Per-cluster overrides for provider-specific settings. These take precedence over ProviderConfig defaults. Only the section matching the cluster's provider is used.

infrastructureOverride:
  harvester:
    namespace: custom-namespace
    networkName: default/vlan50-dev
    imageName: default/talos-v1-12-2
  nutanix:
    clusterUUID: "..."
    subnetUUID: "..."
    imageUUID: "..."
    storageContainerUUID: "..."
  proxmox:
    node: pve-node-03
    storage: ceph-ssd
    templateID: 9001
  gcp:
    zone: us-central1-b
    machineType: n2-standard-8
    image: talos-v1-12-5-custom
    subnetwork: custom-subnet

Status

The status subresource tracks the current state of the cluster.

status:
  phase: Ready
  controlPlaneEndpoint: "10.40.0.201:6443"
  workerNodesReady: 3
  workerNodesDesired: 3
  tenantNamespace: "tenant-my-cluster"
  kubeconfigSecretRef:
    name: my-cluster-kubeconfig
  observedState:
    kubernetesVersion: "v1.31.0"
    workers:
      desired: 3
      ready: 3
      nodes:
        - my-cluster-worker-0
        - my-cluster-worker-1
        - my-cluster-worker-2
    addons:
      - name: cilium
        version: "1.17.0"
        status: Healthy
        managedBy: butler

Status Fields

| Field | Type | Description |
|---|---|---|
| phase | string | Current lifecycle phase |
| controlPlaneEndpoint | string | API server endpoint |
| workerNodesReady | integer | Number of ready worker nodes |
| workerNodesDesired | integer | Desired number of workers |
| tenantNamespace | string | Namespace containing CAPI/Steward resources |
| kubeconfigSecretRef | object | Reference to kubeconfig Secret |
| observedState | object | Observed cluster state |
| ipAllocationRef | object | Reference to the node IP allocation from IPAM |
| lbAllocationRef | object | Reference to the load balancer IP allocation from IPAM |
| imageSyncRef | object | Reference to the ImageSync resource for this cluster's OS image |

Phases

| Phase | Description |
|---|---|
| Pending | CR created, awaiting reconciliation |
| Provisioning | Creating control plane and workers |
| Installing | Installing platform addons |
| Ready | Cluster fully operational |
| Updating | Processing spec changes |
| Deleting | Cleaning up resources |
| Failed | Error state (check conditions) |

Conditions

| Condition | Description |
|---|---|
| InfrastructureReady | CAPI resources are ready |
| ControlPlaneReady | Control plane pods are running |
| WorkersReady | All worker nodes have joined |
| AddonsReady | Platform addons are healthy |
| NetworkReady | IP allocation is complete |
| ProviderAccessGranted | Provider scope check passed |
| ImageReady | OS image is synced to the provider |
| Ready | Overall cluster readiness |
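
For orientation, conditions on a cluster that is still provisioning might look like the sketch below. It assumes that conditions are reported under status.conditions in the standard Kubernetes condition layout (type, status, reason, message, lastTransitionTime), which this reference does not confirm. The reason strings, messages, and timestamps are hypothetical.

```yaml
status:
  phase: Provisioning
  conditions:
    - type: ControlPlaneReady
      status: "True"
      reason: ControlPlaneAvailable      # hypothetical reason
      message: "Control plane pods are running"
      lastTransitionTime: "2025-01-15T10:04:12Z"
    - type: WorkersReady
      status: "False"
      reason: WaitingForWorkers          # hypothetical reason
      message: "1 of 3 workers have joined"
      lastTransitionTime: "2025-01-15T10:06:40Z"
    - type: Ready
      status: "False"
      reason: WorkersNotReady            # hypothetical reason
      message: "Waiting for WorkersReady"
      lastTransitionTime: "2025-01-15T10:06:40Z"
```

When phase is Failed, a condition with status "False" is the first place to look for the cause.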

Labels

Butler automatically applies these labels:

| Label | Description |
|---|---|
| butler.butlerlabs.dev/team | Team name |
| butler.butlerlabs.dev/tenant | Cluster name |
| app.kubernetes.io/managed-by | Always "butler" |
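
Applied to the cluster from the full example, the labels would look roughly like this. Which resources carry them (the TenantCluster itself, its child resources, or both) is not stated here, so treat the placement as an assumption.

```yaml
metadata:
  name: my-cluster
  namespace: team-backend
  labels:
    butler.butlerlabs.dev/team: backend-team
    butler.butlerlabs.dev/tenant: my-cluster
    app.kubernetes.io/managed-by: butler
```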

Finalizers

| Finalizer | Description |
|---|---|
| butler.butlerlabs.dev/tenantcluster | Ensures cleanup of child resources |

Examples

Minimal Cluster

apiVersion: butler.butlerlabs.dev/v1alpha1
kind: TenantCluster
metadata:
  name: dev-cluster
  namespace: butler-tenants
spec:
  kubernetesVersion: "v1.31.0"
  workers:
    replicas: 2
  addons:
    cni:
      provider: cilium
      version: "1.17.0"
    loadBalancer:
      provider: metallb
      version: "0.14.9"

Production Cluster with Talos Workers

apiVersion: butler.butlerlabs.dev/v1alpha1
kind: TenantCluster
metadata:
  name: prod-api
  namespace: team-backend
spec:
  kubernetesVersion: "v1.31.0"
  controlPlane:
    replicas: 3
    externalCloudProvider: true
    resources:
      apiServer:
        requests:
          cpu: "500m"
          memory: "512Mi"
        limits:
          cpu: "4"
          memory: "2Gi"
  workers:
    replicas: 5
    machineTemplate:
      cpu: 8
      memory: "32Gi"
      diskSize: "200Gi"
      os:
        type: talos
        version: "v1.12.2"
        schematicID: "71e06ba76d3cf365bb4ab4d8f8f4fea55a7620811666b9c25623734ab18ddd27"
  providerConfigRef:
    name: harvester-prod
  networking:
    loadBalancerPool:
      start: "10.40.2.100"
      end: "10.40.2.150"
  addons:
    cni:
      provider: cilium
      version: "1.17.0"
    loadBalancer:
      provider: metallb
      version: "0.14.9"
    storage:
      provider: longhorn
      version: "1.7.2"
    certManager:
      enabled: true
      version: "1.16.3"
    ingress:
      provider: traefik
      version: "3.3.0"
    gitops:
      provider: fluxcd
      version: "2.4.0"
      repository:
        url: https://github.com/myorg/clusters
        branch: main
        path: clusters/prod-api

Multi-OS: Rocky Linux Workers

apiVersion: butler.butlerlabs.dev/v1alpha1
kind: TenantCluster
metadata:
  name: rocky-cluster
  namespace: team-legacy
spec:
  kubernetesVersion: "v1.30.2"
  workers:
    replicas: 3
    machineTemplate:
      cpu: 4
      memory: "16Gi"
      diskSize: "100Gi"
      os:
        type: rocky
        version: "9.5"
  addons:
    cni:
      provider: cilium
      version: "1.17.0"
    loadBalancer:
      provider: metallb
      version: "0.14.9"

See Also