Enable volume snapshots (vSphere CSI)
What you’ll accomplish
You’ll enable Velero to capture persistent volume data using vSphere CSI snapshots on an RKE2 cluster, so your backups include both Kubernetes resources and on-disk application state.
Prerequisites
- UDS CLI installed
- Access to an RKE2 cluster with UDS Core deployed
- Velero storage backend configured (see Configure Velero storage backends)
- vSphere environment with a user account that has the required CSI roles and privileges (see Broadcom vSphere Roles and Privileges)
- Ability to apply `HelmChartConfig` overrides to RKE2 system charts
Before you begin
By default, UDS Core backs up Kubernetes resources only. Volume snapshots are disabled:

| Setting | Default |
|---|---|
| `snapshotsEnabled` | `false` |
| `schedules.udsbackup.template.snapshotVolumes` | `false` |
1. Install and configure the vSphere CSI driver

   On your RKE2 cluster, set the cloud provider in your RKE2 configuration:

   config.yaml

       cloud-provider-name: rancher-vsphere

   Provide `HelmChartConfig` overrides for the CPI and CSI drivers. Three CSI overrides are critical: `blockVolumeSnapshot` must be enabled, `configTemplate` must be overridden to include the snapshot limit, and `global-max-snapshots-per-block-volume` must be set high enough for your retention policy.

   helmchartconfig.yaml

       ---
       apiVersion: helm.cattle.io/v1
       kind: HelmChartConfig
       metadata:
         name: rancher-vsphere-cpi
         namespace: kube-system
       spec:
         valuesContent: |-
           vCenter:
             host: "<vsphere-server>"
             port: 443
             insecureFlag: true
             datacenters: "<vsphere-datacenter-name>"
             username: "<vsphere-csi-username>"
             password: "<vsphere-csi-password>"
             credentialsSecret:
               name: "vsphere-cpi-creds"
               generate: true
       ---
       apiVersion: helm.cattle.io/v1
       kind: HelmChartConfig
       metadata:
         name: rancher-vsphere-csi
         namespace: kube-system
       spec:
         valuesContent: |-
           vCenter:
             datacenters: "<vsphere-datacenter-name>"
             username: "<vsphere-csi-username>"
             password: "<vsphere-csi-password>"
             configSecret:
               configTemplate: |
                 [Global]
                 cluster-id = "<rke2-cluster-id>"
                 user = "<vsphere-csi-username>"
                 password = "<vsphere-csi-password>"
                 port = 443
                 insecure-flag = "1"
                 [VirtualCenter "<vsphere-server>"]
                 datacenters = "<vsphere-datacenter-name>"
                 [Snapshot]
                 global-max-snapshots-per-block-volume = 12
           csiNode:
             tolerations:
               - operator: "Exists"
                 effect: "NoSchedule"
           blockVolumeSnapshot:
             enabled: true
           storageClass:
             reclaimPolicy: Retain
2. Create a VolumeSnapshotClass

   Define a `VolumeSnapshotClass` that tells Velero how to create snapshots using the vSphere CSI driver. Deploy this as a manifest in a Zarf package included in your bundle:

   volumesnapshotclass.yaml

       apiVersion: snapshot.storage.k8s.io/v1
       kind: VolumeSnapshotClass
       metadata:
         name: vsphere-csi-snapshot-class
         labels:
           velero.io/csi-volumesnapshot-class: "true"
       driver: csi.vsphere.vmware.com
       deletionPolicy: Retain
3. Enable CSI snapshots in Velero

   Add the following overrides to enable CSI-based volume snapshots:

   uds-bundle.yaml

       packages:
         - name: core
           repository: registry.defenseunicorns.com/public/core
           ref: x.x.x-upstream
           overrides:
             velero:
               velero:
                 values:
                   - path: configuration.features
                     value: EnableCSI
                   - path: snapshotsEnabled
                     value: true
                   - path: configuration.volumeSnapshotLocation
                     value:
                       - name: default
                         provider: velero.io/csi
                   - path: schedules.udsbackup.template.snapshotVolumes
                     value: true
4. Create and deploy your bundle

       uds create <path-to-bundle-dir>
       uds deploy uds-bundle-<name>-<arch>-<version>.tar.zst
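Once the bundle is deployed, it can help to confirm that the VolumeSnapshotClass is present with the expected driver and the `velero.io/csi-volumesnapshot-class` label that marks it for Velero. The following is a minimal sketch, not part of UDS Core: the `check_snapshot_class` function and the `KUBECTL` variable are hypothetical names introduced here, and the class name and driver are taken from the manifest in this guide.

```shell
# Assumption: KUBECTL defaults to the kubectl wrapper used in this guide.
KUBECTL=${KUBECTL:-"uds zarf tools kubectl"}

# Hypothetical helper: check the driver and Velero label on a VolumeSnapshotClass.
check_snapshot_class() {
  name=${1:-vsphere-csi-snapshot-class}
  driver=$($KUBECTL get volumesnapshotclass "$name" -o jsonpath='{.driver}')
  # Dots inside the label key are escaped so jsonpath treats it as one key.
  label=$($KUBECTL get volumesnapshotclass "$name" \
    -o jsonpath='{.metadata.labels.velero\.io/csi-volumesnapshot-class}')
  if [ "$driver" = "csi.vsphere.vmware.com" ] && [ "$label" = "true" ]; then
    echo "ok: $name uses $driver and is labeled for Velero"
  else
    echo "check failed: driver='$driver' velero label='$label'"
    return 1
  fi
}
```

Run `check_snapshot_class` with no arguments to check the class defined in this guide, or pass a different class name as the first argument.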
Verification
    # Verify snapshots are enabled on the schedule
    uds zarf tools kubectl get schedule -n velero velero-udsbackup -o jsonpath='{.spec.template.snapshotVolumes}'

    # Verify the VolumeSnapshotLocation exists
    uds zarf tools kubectl get volumesnapshotlocation -n velero

    # After a backup completes, check for volume snapshots
    uds zarf tools kubectl get volumesnapshot -A

Success criteria:
- `snapshotVolumes` is `true` on the schedule
- A VolumeSnapshotLocation with provider `velero.io/csi` exists in the `velero` namespace
- After a backup completes, VolumeSnapshot resources are created for each PVC
- Snapshot count matches the number of PVCs in backed-up namespaces
To trigger a manual backup for testing, see Perform a manual backup.
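The last success criterion can be checked per namespace by comparing counts. This is a rough sketch that assumes a single completed backup, so that each PVC has exactly one VolumeSnapshot; `count_resources`, `check_snapshot_coverage`, and the `KUBECTL` variable are hypothetical names introduced for illustration.

```shell
# Assumption: KUBECTL defaults to the kubectl wrapper used in this guide.
KUBECTL=${KUBECTL:-"uds zarf tools kubectl"}

# Count resources of a given kind in a namespace.
count_resources() {
  $KUBECTL get "$1" -n "$2" -o name | wc -l | tr -d ' '
}

# Hypothetical helper: compare VolumeSnapshot and PVC counts in one namespace.
check_snapshot_coverage() {
  ns=$1
  pvcs=$(count_resources pvc "$ns")
  snaps=$(count_resources volumesnapshot "$ns")
  if [ "$snaps" -eq "$pvcs" ]; then
    echo "ok: $ns has $snaps snapshot(s) for $pvcs PVC(s)"
  else
    echo "mismatch: $ns has $snaps snapshot(s) for $pvcs PVC(s)"
    return 1
  fi
}
```

For example, `check_snapshot_coverage my-app` reports on a namespace named `my-app` (a placeholder). After several backups have run, the snapshot count may legitimately exceed the PVC count, so treat a mismatch as a prompt to look closer rather than as a failure.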
Troubleshooting
Problem: Snapshot limit reached
Symptoms: Backups fail with a FailedPrecondition error in the Velero logs:

    error executing custom action: rpc error: code = FailedPrecondition desc = the number of snapshots on the source volume reaches the configured maximum (3)

Solution: Increase `global-max-snapshots-per-block-volume` in the vSphere CSI `HelmChartConfig`. The maximum of 3 in the message is the vSphere CSI default. A value of at least 10 is required for the default 10-day retention, with 12 recommended for buffer. See the snapshot limit guidance in Before you begin and update the `[Snapshot]` section in the CSI `configTemplate` in step 1.
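The sizing above can be worked out for other retention policies. The sketch below assumes each retained backup holds one snapshot per volume, so the limit is retention days times backups per day plus a buffer; `min_snapshot_limit` is a hypothetical helper, and the default of one backup per day and a buffer of 2 are assumptions chosen to reproduce the recommended value of 12.

```shell
# Hypothetical helper: estimate global-max-snapshots-per-block-volume.
# Arguments: retention days, backups per day (default 1), buffer (default 2).
min_snapshot_limit() {
  retention_days=$1
  backups_per_day=${2:-1}
  buffer=${3:-2}
  echo $(( retention_days * backups_per_day + buffer ))
}

# 10-day retention with one backup per day and a buffer of 2
min_snapshot_limit 10   # prints 12
```

If your schedule takes more than one backup per day, pass that as the second argument and check the result against the snapshot-per-volume limits your vSphere version supports.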
Problem: VolumeSnapshotContents remain after backup deletion
Symptoms: Deleting a backup does not clean up the associated VolumeSnapshotContents in Kubernetes or in vSphere.
Solution: Be cautious when deleting backups that have been used for restores, because Velero may attempt to delete VolumeSnapshotContents that are still in use by restored volumes. Velero’s garbage collection runs hourly by default.
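Before deleting a backup, it may help to list the VolumeSnapshotContents that exist, along with their deletion policy and the VolumeSnapshot each one is bound to. This is a sketch only: `list_snapshot_contents` and the `KUBECTL` variable are hypothetical names, and the columns use fields from the `snapshot.storage.k8s.io/v1` API.

```shell
# Assumption: KUBECTL defaults to the kubectl wrapper used in this guide.
KUBECTL=${KUBECTL:-"uds zarf tools kubectl"}

# Hypothetical helper: list VolumeSnapshotContents (cluster-scoped) with the
# deletion policy and the VolumeSnapshot each one is bound to.
list_snapshot_contents() {
  $KUBECTL get volumesnapshotcontent \
    -o custom-columns='NAME:.metadata.name,POLICY:.spec.deletionPolicy,SNAPSHOT:.spec.volumeSnapshotRef.name,NAMESPACE:.spec.volumeSnapshotRef.namespace,READY:.status.readyToUse'
}
```

Entries whose bound snapshot lives in a namespace you restored into deserve a closer look before the corresponding backup is deleted.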
Related documentation
- Velero: CSI Snapshot Support (CSI integration details and configuration)
- Kubernetes: Volume Snapshots (CSI snapshot API reference)
- Rancher vSphere Charts (CPI and CSI driver Helm charts)
- vSphere CSI Snapshot Limits (snapshot per volume configuration)
- Backup & restore concepts (how Velero fits into UDS Core)
Next steps
These guides and concepts may be useful to explore next: