Skip to main content

Monitor

Butler controllers expose Prometheus metrics on :8080/metrics.

Built-in Metrics

MetricDescription
butler_tenantcluster_countTotal tenant clusters by phase
butler_tenantcluster_phaseCurrent phase per cluster
butler_reconcile_duration_secondsReconciliation latency histogram
butler_addon_install_duration_secondsAddon installation latency histogram

Install Victoria Metrics and Grafana as ManagementAddons:

apiVersion: butler.butlerlabs.dev/v1alpha1
kind: ManagementAddon
metadata:
name: victoria-metrics
spec:
addon: victoria-metrics
---
apiVersion: butler.butlerlabs.dev/v1alpha1
kind: ManagementAddon
metadata:
name: grafana
spec:
addon: grafana

Alerting Rules

groups:
- name: butler
rules:
- alert: TenantClusterFailed
expr: butler_tenantcluster_phase{phase="Failed"} > 0
for: 5m
labels:
severity: critical
annotations:
summary: "Tenant cluster in Failed state"

- alert: HighReconcileLatency
expr: histogram_quantile(0.99, butler_reconcile_duration_seconds_bucket) > 60
for: 10m
labels:
severity: warning
annotations:
summary: "Slow reconciliation detected"

Enable Debug Logging

Set the LOG_LEVEL environment variable on the controller deployment:

kubectl set env deploy/butler-controller -n butler-system LOG_LEVEL=debug