My Rancher setup looks like this:
- 1 Local (Imported) RKE Cluster where Rancher is running
- 1 User RKE Cluster where client Workloads are running
I have 3 control+etcd Nodes in each RKE Cluster.
I perform etcd backups for both RKE clusters. All etcd backups are stored on S3 and locally on each control+etcd node under /opt/rke/etcd-snapshots/.
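For context, an on-demand snapshot with this same local + S3 layout can be taken with the RKE CLI; the snapshot name, bucket, and credentials below are placeholders, not values from my setup:

```shell
# One-time etcd snapshot of an RKE cluster: written to
# /opt/rke/etcd-snapshots/ on each etcd node and uploaded to S3.
# All names and keys below are placeholders.
rke etcd snapshot-save \
  --config cluster.yml \
  --name pre-disaster-snapshot \
  --s3 \
  --bucket-name my-rke-backups \
  --s3-endpoint s3.amazonaws.com \
  --access-key <ACCESS_KEY> \
  --secret-key <SECRET_KEY>
```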
Additionally, Rancher 2.5.x provides the rancher-backup operator, which performs a backup of the Rancher App itself.
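The operator is driven by a `kind: Backup` custom resource; a minimal recurring backup targeting S3 might look like this (the secret name, bucket, folder, and region are assumptions for illustration):

```yaml
apiVersion: resources.cattle.io/v1
kind: Backup
metadata:
  name: rancher-nightly-backup
spec:
  resourceSetName: rancher-resource-set   # default set shipped with the rancher-backup chart
  schedule: "@midnight"                   # recurring backup; omit for a one-shot backup
  retentionCount: 10
  storageLocation:
    s3:
      credentialSecretName: s3-creds      # Secret holding accessKey/secretKey (placeholder)
      credentialSecretNamespace: default
      bucketName: rancher-app-backups
      folder: rancher
      region: eu-west-1
      endpoint: s3.eu-west-1.amazonaws.com
```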
I would like to be able to restore everything like it was before in case of a disaster:
- Installed Apps with Helm
- Workloads, Configs, Secrets, …
What are the steps needed to restore Rancher 2.5.x from backup, assuming everything went down and you have to recreate everything from scratch?
Something like this maybe:
- Spin up a new Local (Imported) Cluster with Rancher running on it.
- Do an rke etcd snapshot-restore <<local_snapshot>> for the Local (Imported) Cluster.
I wasn’t able to find an official procedure for restoring the Local RKE Cluster. Is this correct?
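For what it’s worth, I assume the restore itself would be the standard RKE command, run from the node holding cluster.yml, with the snapshot name as it appears under /opt/rke/etcd-snapshots/ (the name is a placeholder):

```shell
# Restore the local cluster's etcd from a named snapshot.
# RKE reconciles and brings the cluster back up as part of the restore.
rke etcd snapshot-restore \
  --config cluster.yml \
  --name local_snapshot
```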
- Restore the Rancher App from the rancher-backup operator backup by creating a kind: Restore resource.
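A `kind: Restore` resource for that step could look roughly like this; the backup filename must match the exact object in the bucket, and all names here are placeholders:

```yaml
apiVersion: resources.cattle.io/v1
kind: Restore
metadata:
  name: restore-rancher
spec:
  backupFilename: <BACKUP_FILENAME>.tar.gz  # exact backup object name in the bucket
  prune: false                              # don't delete resources missing from the backup
  storageLocation:
    s3:
      credentialSecretName: s3-creds        # same placeholder Secret as for the Backup
      credentialSecretNamespace: default
      bucketName: rancher-app-backups
      folder: rancher
      region: eu-west-1
      endpoint: s3.eu-west-1.amazonaws.com
```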
- Spin up a new User Cluster.
- Do an rke etcd snapshot-restore <<user_cluster_snapshot>> on the User Cluster.
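Since the user cluster’s snapshots are also on S3, I assume this restore could pull straight from the bucket rather than relying on the local copies (config path, snapshot name, and S3 values are placeholders):

```shell
# Restore the user cluster's etcd directly from the S3 copy of the snapshot.
rke etcd snapshot-restore \
  --config user-cluster.yml \
  --name user_cluster_snapshot \
  --s3 \
  --bucket-name my-rke-backups \
  --s3-endpoint s3.amazonaws.com \
  --access-key <ACCESS_KEY> \
  --secret-key <SECRET_KEY>
```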
Is the order of the steps correct? Local Cluster → Rancher App → User Cluster?