Backup & restore
UDS Core provides cluster backup and restore capabilities through Velero, an open-source tool for backing up Kubernetes resources and persistent volume data. The backup layer lets platform operators recover from data loss, cluster corruption, or infrastructure failure without losing application state.
Why backup is a platform concern
Application teams should not need to design backup strategies for each service they deploy. Backup belongs at the platform layer because:
- Consistency — a cluster-level backup captures all namespaces and volumes in a coordinated way, avoiding split-brain scenarios where application data and Kubernetes state diverge
- Recovery testing — the platform defines and tests restore procedures; application teams rely on the guarantee rather than each maintaining their own
- Compliance — regulated environments require documented, tested backup and recovery capabilities with defined RPO (recovery point objective — how much data you can afford to lose) and RTO (recovery time objective — how long you can afford to be down) targets
What Velero backs up
| Component | Role |
|---|---|
| Velero | Orchestrates scheduled backups of Kubernetes resources and coordinates volume snapshots |
| Object storage (S3/MinIO) | Stores serialized resource manifests (Deployments, ConfigMaps, Secrets, UDS CRs, etc.) |
| Cloud provider snapshot API | Captures persistent volume state via EBS, Azure Disk, vSphere, or CSI-compatible snapshots |
Kubernetes resource backup — Velero captures the state of Kubernetes objects: Deployments, StatefulSets, ConfigMaps, Secrets, PersistentVolumeClaims, and custom resources (including UDS Package and Exemption CRs). These are stored as serialized object manifests in an object store.
Volume snapshot backup — Velero integrates with cloud provider volume snapshot APIs (AWS EBS, Azure Disk, vSphere) to capture the on-disk state of persistent volumes at a point in time. Volume snapshots are coordinated with the resource backup so that application data and Kubernetes state are consistent.
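To make the two backup types concrete, here is a minimal Python sketch that assembles an on-demand `velero backup create` command covering both Kubernetes resources and volume snapshots. The backup name `pre-upgrade`, the namespace `my-app`, and the 72-hour TTL are hypothetical values chosen for illustration; the helper only builds and prints the command, it does not run it.

```python
import shlex


def backup_command(name, namespaces, snapshot_volumes=True, ttl="72h0m0s"):
    """Build the argv for an on-demand `velero backup create` call.

    name and namespaces are illustrative inputs, not UDS Core defaults.
    """
    argv = ["velero", "backup", "create", name]
    if namespaces:
        # Limit the resource backup to the listed namespaces.
        argv += ["--include-namespaces", ",".join(namespaces)]
    # Request (or skip) persistent volume snapshots alongside the manifests.
    flag_value = "true" if snapshot_volumes else "false"
    argv.append(f"--snapshot-volumes={flag_value}")
    # Retention for this one backup.
    argv += ["--ttl", ttl]
    return argv


# Hypothetical example: back up one namespace before an upgrade.
cmd = backup_command("pre-upgrade", ["my-app"])
print(shlex.join(cmd))
```

Passing an empty namespace list omits `--include-namespaces`, so the backup is not restricted to specific namespaces.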
Backup schedule and retention
Velero runs backups on a configurable cron schedule, with retention controlled per-backup via a --ttl flag.
Teams can customize the schedule, retention, and scope to match their RTO/RPO requirements — for example, adding more frequent snapshots for critical namespaces or extending retention for compliance.
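As a rough illustration of tuning schedule and retention, the Python sketch below builds a Velero `Schedule` manifest as a plain dictionary and checks a backup interval against an RPO target. The schedule name, cron expression, TTL, namespace list, and the `velero` install namespace are assumptions for the example, not UDS Core defaults; `meets_rpo` is a hypothetical helper that treats the worst-case data loss as one full interval between backups.

```python
import json
from datetime import timedelta


def schedule_manifest(name, cron, ttl, namespaces=None):
    """Return a Velero Schedule manifest as a dict (values are illustrative)."""
    template = {
        "ttl": ttl,               # retention applied to each backup it creates
        "snapshotVolumes": True,  # include persistent volume snapshots
    }
    if namespaces:
        template["includedNamespaces"] = list(namespaces)
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        # Assumption: Velero is installed in the "velero" namespace.
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {"schedule": cron, "template": template},
    }


def meets_rpo(backup_interval, rpo):
    """Worst case, everything since the previous backup is lost."""
    return backup_interval <= rpo


# Hypothetical example: back up a critical namespace every 6 hours, keep 30 days.
manifest = schedule_manifest("critical-6h", "0 */6 * * *", "720h0m0s", ["my-app"])
print(json.dumps(manifest, indent=2))
print(meets_rpo(timedelta(hours=6), rpo=timedelta(hours=12)))
```

The interval check ignores how long a backup takes to complete, so treat it as a lower bound on the achievable RPO rather than a guarantee.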
Restore scenarios
| Scenario | When to use |
|---|---|
| Namespace-level restore | Single application namespace was accidentally deleted or corrupted; other workloads are unaffected |
| Cluster-level restore | Catastrophic infrastructure failure; provision new infrastructure and restore all namespaces from the most recent backup |
| Point-in-time restore | Corruption or data loss discovered after the fact; restore to a snapshot from before the event occurred |
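The scenarios in the table differ mainly in which backup is chosen and how much of it is restored. The Python sketch below shows one way to express that: `pick_backup` selects the newest backup completed before an incident (point-in-time) or the newest overall, and `restore_command` builds a `velero restore create` call, optionally scoped to namespaces. Both helpers, the backup names, and the timestamps are hypothetical; in practice the backup list would come from the cluster.

```python
from datetime import datetime, timezone


def pick_backup(backups, before=None):
    """Return the name of the newest backup, optionally completed before a cutoff.

    backups is a list of (name, completed_at) tuples with timezone-aware datetimes.
    """
    candidates = [b for b in backups if before is None or b[1] < before]
    if not candidates:
        raise ValueError("no backup predates the requested point in time")
    return max(candidates, key=lambda b: b[1])[0]


def restore_command(backup_name, namespaces=None):
    """Build the argv for `velero restore create` from a named backup."""
    argv = ["velero", "restore", "create", "--from-backup", backup_name]
    if namespaces:
        # Namespace-level restore: leave other workloads untouched.
        argv += ["--include-namespaces", ",".join(namespaces)]
    return argv


# Hypothetical backup inventory.
utc = timezone.utc
backups = [
    ("daily-0501", datetime(2024, 5, 1, 3, 0, tzinfo=utc)),
    ("daily-0502", datetime(2024, 5, 2, 3, 0, tzinfo=utc)),
    ("daily-0503", datetime(2024, 5, 3, 3, 0, tzinfo=utc)),
    ("daily-0504", datetime(2024, 5, 4, 3, 0, tzinfo=utc)),
]

# Point-in-time: corruption traced to midday on May 3.
incident = datetime(2024, 5, 3, 12, 0, tzinfo=utc)
print(restore_command(pick_backup(backups, before=incident)))

# Namespace-level: restore one namespace from the most recent backup.
print(restore_command(pick_backup(backups), namespaces=["my-app"]))
```

A cluster-level restore is the same call without a namespace filter, run against newly provisioned infrastructure.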
What backup does not cover
Storage provider integration
Velero requires a storage provider plugin and appropriate permissions to perform volume snapshots. UDS Core’s backup layer is configured at bundle deploy time with the target storage provider and destination. Velero supports cloud-native snapshot APIs (AWS EBS, Azure Disk, vSphere) as well as CSI-compatible storage that supports the volume snapshot API for on-premises deployments. See the Velero supported providers documentation for the full list of available plugins.