Backup & Restore

UDS Core provides cluster backup and restore capabilities through Velero, an open-source tool for backing up Kubernetes resources and persistent volume data. The backup layer lets platform operators recover from data loss, cluster corruption, or infrastructure failure without losing application state.

Why backup is a platform concern

Application teams should not need to design backup strategies for each service they deploy. Backup belongs at the platform layer because:

Consistency: a cluster-level backup captures all namespaces and volumes in a coordinated way, so application data and Kubernetes state do not drift apart between separately taken backups
Recovery testing: the platform defines and tests restore procedures; application teams rely on that guarantee instead of each maintaining their own
Compliance: regulated environments require documented, tested backup and recovery capabilities with defined targets for RPO (recovery point objective, the amount of data you can afford to lose) and RTO (recovery time objective, the length of time you can afford to be down)

What Velero backs up

Component	Role
Velero	Orchestrates scheduled backups of Kubernetes resources and coordinates volume snapshots
Object storage (S3/MinIO)	Stores serialized resource manifests (Deployments, ConfigMaps, Secrets, UDS CRs, etc.)
Cloud provider snapshot API	Captures persistent volume state via EBS, Azure Disk, vSphere, or CSI-compatible snapshots

Kubernetes resource backup: Velero captures the state of Kubernetes objects such as Deployments, StatefulSets, ConfigMaps, Secrets, PersistentVolumeClaims, and custom resources (including UDS Package and Exemption CRs). These are stored as serialized object manifests in an object store.

Volume snapshot backup: Velero integrates with volume snapshot APIs (AWS EBS, Azure Disk, vSphere, or CSI-compatible storage) to capture the on-disk state of persistent volumes at a point in time. Volume snapshots are coordinated with the resource backup so that application data and Kubernetes state are consistent.

Backup schedule and retention

Velero runs backups on a configurable cron schedule. Retention is set per backup with the --ttl flag; once a backup's TTL expires, Velero deletes it.
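As a rough illustration of how a cron schedule and a TTL fit together, the sketch below builds a Velero Schedule resource as a plain Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The field names follow the velero.io/v1 Schedule type. The resource name daily-full, the app-a namespace, the 03:00 cron expression, the 240h TTL, and the velero install namespace are all assumptions chosen for the example, not values UDS Core is known to ship with.

```python
import json


def build_schedule(name, cron, ttl, namespaces=None, snapshot_volumes=True):
    """Return a Velero Schedule manifest as a plain dict.

    cron is a standard five-field cron expression; ttl is a Go-style
    duration string such as "240h".
    """
    template = {
        "ttl": ttl,                           # how long each backup is kept
        "snapshotVolumes": snapshot_volumes,  # also snapshot persistent volumes
    }
    if namespaces:
        # Omitting includedNamespaces means "all namespaces".
        template["includedNamespaces"] = list(namespaces)
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        # Assumption: Velero is installed in a namespace called "velero".
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {"schedule": cron, "template": template},
    }


# Hypothetical example: back up the app-a namespace daily at 03:00,
# keeping each backup for 10 days.
schedule = build_schedule("daily-full", "0 3 * * *", "240h", namespaces=["app-a"])
print(json.dumps(schedule, indent=2))
```

The printed JSON could be reviewed and then applied with kubectl; how UDS Core exposes these settings through bundle configuration is not covered by this sketch.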

Teams can customize the schedule, retention, and scope to match their RTO/RPO requirements — for example, adding more frequent snapshots for critical namespaces or extending retention for compliance.
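To make the link between schedule, retention, and RPO concrete, here is a small back-of-the-envelope calculation in Python. It assumes backups run at a fixed interval and ignores the time a backup takes to complete, so the numbers are approximations. The 8-hour RPO target, the intervals, and the 10-day TTL are hypothetical values.

```python
from datetime import timedelta


def worst_case_rpo(interval: timedelta) -> timedelta:
    """Data written just after a backup is unprotected until the next one,
    so the worst-case loss window is roughly one full interval."""
    return interval


def retained_backups(interval: timedelta, ttl: timedelta) -> int:
    """Approximate number of unexpired backups held at steady state."""
    return ttl // interval


def meets_rpo(interval: timedelta, target_rpo: timedelta) -> bool:
    """True if the schedule's worst-case loss window fits the target."""
    return worst_case_rpo(interval) <= target_rpo


daily = timedelta(hours=24)
six_hourly = timedelta(hours=6)
ttl = timedelta(days=10)         # equivalent to a 240h TTL
target_rpo = timedelta(hours=8)  # hypothetical compliance target

print(meets_rpo(daily, target_rpo))       # False
print(meets_rpo(six_hourly, target_rpo))  # True
print(retained_backups(six_hourly, ttl))  # 40
```

Under these assumptions, a daily schedule cannot satisfy an 8-hour RPO, while a six-hourly schedule can, at the cost of keeping about four times as many backups for the same TTL.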

Restore scenarios

Scenario	When to use
Namespace-level restore	Single application namespace was accidentally deleted or corrupted; other workloads are unaffected
Cluster-level restore	Catastrophic infrastructure failure; provision new infrastructure and restore all namespaces from the most recent backup
Point-in-time restore	Corruption or data loss discovered after the fact; restore to a snapshot from before the event occurred
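The point-in-time and namespace-level scenarios above can be sketched in Python as two steps: pick the most recent backup that completed before the incident, then build a Velero Restore resource scoped to the affected namespace. The Restore field names follow the velero.io/v1 Restore type. The backup names, timestamps, app-a namespace, restore name, and the velero install namespace are hypothetical, and the backup list is hard-coded here where a real workflow would read it from the cluster.

```python
import json
from datetime import datetime, timezone


def pick_restore_point(backups, incident_time):
    """Return the name of the latest backup completed strictly before
    incident_time, or None if no backup predates the incident.

    backups is a list of (name, completed_at) tuples.
    """
    candidates = [(name, done) for name, done in backups if done < incident_time]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[1])[0]


def build_restore(restore_name, backup_name, namespaces=None):
    """Return a Velero Restore manifest as a plain dict."""
    spec = {"backupName": backup_name}
    if namespaces:
        # Limit the restore to these namespaces; omit to restore everything.
        spec["includedNamespaces"] = list(namespaces)
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Restore",
        # Assumption: Velero is installed in a namespace called "velero".
        "metadata": {"name": restore_name, "namespace": "velero"},
        "spec": spec,
    }


# Hypothetical backup history and incident time.
backups = [
    ("daily-full-0301", datetime(2024, 3, 1, 3, 0, tzinfo=timezone.utc)),
    ("daily-full-0302", datetime(2024, 3, 2, 3, 0, tzinfo=timezone.utc)),
    ("daily-full-0303", datetime(2024, 3, 3, 3, 0, tzinfo=timezone.utc)),
]
incident = datetime(2024, 3, 2, 14, 30, tzinfo=timezone.utc)

chosen = pick_restore_point(backups, incident)
print(chosen)  # daily-full-0302
print(json.dumps(build_restore("app-a-recovery", chosen, ["app-a"]), indent=2))
```

Leaving out the namespace list produces a restore of everything in the backup, which corresponds to the cluster-level scenario.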

What backup does not cover

Storage provider integration

Velero requires a storage provider plugin and appropriate permissions to perform volume snapshots. UDS Core’s backup layer is configured at bundle deploy time with the target storage provider and destination. Velero supports cloud-native snapshot APIs (AWS EBS, Azure Disk, vSphere) as well as CSI-compatible storage that supports the volume snapshot API for on-premises deployments. See the Velero supported providers documentation for the full list of available plugins.