Backup & restore

UDS Core provides cluster backup and restore capabilities through Velero, an open-source tool for backing up Kubernetes resources and persistent volume data. This backup layer lets platform operators recover from data loss, cluster corruption, or infrastructure failure without losing application state.

Application teams should not need to design backup strategies for each service they deploy. Backup belongs at the platform layer because:

  • Consistency — a cluster-level backup captures all namespaces and volumes in a coordinated way, so restored application data and Kubernetes state do not diverge
  • Recovery testing — the platform defines and tests restore procedures; application teams rely on the guarantee rather than each maintaining their own
  • Compliance — regulated environments require documented, tested backup and recovery capabilities with defined RPO (recovery point objective — how much data you can afford to lose) and RTO (recovery time objective — how long you can afford to be down) targets

| Component | Role |
|---|---|
| Velero | Orchestrates scheduled backups of Kubernetes resources and coordinates volume snapshots |
| Object storage (S3/MinIO) | Stores serialized resource manifests (Deployments, ConfigMaps, Secrets, UDS CRs, etc.) |
| Cloud provider snapshot API | Captures persistent volume state via EBS, Azure Disk, vSphere, or CSI-compatible snapshots |

Kubernetes resource backup — Velero captures the state of Kubernetes objects: Deployments, StatefulSets, ConfigMaps, Secrets, PersistentVolumeClaims, and custom resources (including UDS Package and Exemption CRs). These are stored as serialized object manifests in an object store.

Volume snapshot backup — Velero integrates with cloud provider volume snapshot APIs (AWS EBS, Azure Disk, vSphere) to capture the on-disk state of persistent volumes at a point in time. Volume snapshots are coordinated with the resource backup so that application data and Kubernetes state are consistent.

Velero runs backups on a configurable cron schedule. Retention is set per backup with the --ttl flag, a time-to-live after which the backup expires and is deleted.
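
To make the schedule and retention settings concrete, here is a minimal Python sketch that builds a Velero Schedule manifest as a plain dictionary. The field names follow Velero's Schedule custom resource (velero.io/v1). The schedule name, the "velero" namespace, the cron expression, and the 10-day retention are illustrative assumptions, not values taken from UDS Core, and this is not necessarily how UDS Core itself renders its configuration.

```python
import json


def go_duration_from_days(days: int) -> str:
    """Render whole days as a Go-style duration string, e.g. 10 -> '240h0m0s'."""
    if days <= 0:
        raise ValueError("retention must be a positive number of days")
    return f"{days * 24}h0m0s"


def build_schedule(name: str, cron: str, retention_days: int) -> dict:
    """Return a Velero Schedule manifest as a plain dict.

    The name and the 'velero' namespace are hypothetical placeholders.
    """
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            # Standard five-field cron expression
            "schedule": cron,
            "template": {
                # Time-to-live: the backup expires after this duration
                "ttl": go_duration_from_days(retention_days),
                # Also request snapshots of persistent volumes
                "snapshotVolumes": True,
            },
        },
    }


if __name__ == "__main__":
    # Example: a backup at 03:00 every day, kept for 10 days
    print(json.dumps(build_schedule("nightly-example", "0 3 * * *", 10), indent=2))
```

The printed JSON could be saved and reviewed before being applied to a cluster; the point of the sketch is only to show where the cron expression and the TTL live in the resource.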

Teams can customize the schedule, retention, and scope to match their RTO/RPO requirements — for example, adding more frequent snapshots for critical namespaces or extending retention for compliance.
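
When matching a schedule to an RPO target, some quick arithmetic helps. The sketch below assumes a fixed backup interval and treats "interval plus the time one backup takes to complete" as a conservative upper bound on data loss, since a failure just before a backup finishes falls back to the previous one. The function names and the sample numbers are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(interval: timedelta, backup_duration: timedelta) -> timedelta:
    """Conservative upper bound on lost data for a fixed-interval schedule."""
    return interval + backup_duration


def meets_rpo(interval: timedelta, backup_duration: timedelta, rpo: timedelta) -> bool:
    """True if the worst-case data loss stays within the RPO target."""
    return worst_case_data_loss(interval, backup_duration) <= rpo


def retained_backups(interval: timedelta, ttl: timedelta) -> int:
    """Approximate number of backups held at steady state for a given TTL."""
    return ttl // interval


if __name__ == "__main__":
    rpo = timedelta(hours=4)
    # A daily backup that takes 30 minutes cannot satisfy a 4-hour RPO
    print(meets_rpo(timedelta(hours=24), timedelta(minutes=30), rpo))
    # An hourly backup that takes 10 minutes can
    print(meets_rpo(timedelta(hours=1), timedelta(minutes=10), rpo))
    # Daily backups with a 10-day TTL keep roughly 10 backups
    print(retained_backups(timedelta(hours=24), timedelta(days=10)))
```

The same calculation shows the cost side of the trade-off: shortening the interval improves the RPO but raises the number of backups retained for a given TTL.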

| Scenario | When to use |
|---|---|
| Namespace-level restore | Single application namespace was accidentally deleted or corrupted; other workloads are unaffected |
| Cluster-level restore | Catastrophic infrastructure failure; provision new infrastructure and restore all namespaces from the most recent backup |
| Point-in-time restore | Corruption or data loss discovered after the fact; restore to a snapshot from before the event occurred |
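
The point-in-time and namespace-level scenarios can be combined in a short sketch: pick the most recent backup taken before the incident, then describe a restore limited to the affected namespace. The Restore fields follow Velero's Restore custom resource (velero.io/v1). The backup names, timestamps, the "velero" namespace, and the "my-app" namespace are made-up examples, and in practice the backup list would come from the cluster rather than a hard-coded dictionary.

```python
from datetime import datetime, timezone
from typing import Optional


def latest_backup_before(backups: dict[str, datetime], incident: datetime) -> Optional[str]:
    """Return the name of the newest backup taken strictly before the incident."""
    candidates = {name: ts for name, ts in backups.items() if ts < incident}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def build_restore(backup_name: str, namespaces: list[str]) -> dict:
    """Return a Velero Restore manifest scoped to the given namespaces."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Restore",
        # Hypothetical naming scheme and namespace
        "metadata": {"name": f"{backup_name}-restore", "namespace": "velero"},
        "spec": {
            "backupName": backup_name,
            "includedNamespaces": namespaces,
            # Restore persistent volumes from their snapshots as well
            "restorePVs": True,
        },
    }


if __name__ == "__main__":
    backups = {
        "nightly-0601": datetime(2024, 6, 1, 3, 0, tzinfo=timezone.utc),
        "nightly-0602": datetime(2024, 6, 2, 3, 0, tzinfo=timezone.utc),
        "nightly-0603": datetime(2024, 6, 3, 3, 0, tzinfo=timezone.utc),
    }
    # Corruption is traced back to the afternoon of June 2
    incident = datetime(2024, 6, 2, 14, 30, tzinfo=timezone.utc)
    chosen = latest_backup_before(backups, incident)
    if chosen is not None:
        print(build_restore(chosen, ["my-app"]))
```

Choosing a backup strictly before the incident is the important design point: the most recent backup is only the right choice when the damage has not already been captured in it.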

Velero requires a storage provider plugin and appropriate permissions to perform volume snapshots. UDS Core’s backup layer is configured at bundle deploy time with the target storage provider and destination. Velero supports cloud-native snapshot APIs (AWS EBS, Azure Disk, vSphere) as well as CSI-compatible storage that supports the volume snapshot API for on-premises deployments. See the Velero supported providers documentation for the full list of available plugins.