====== Disaster recovery ======

===== Replace NVMe device =====

Only one NVMe slot is available, so the idea is to copy the NVMe contents to the HDDs first and then restore them onto the new NVMe device.

Stop Ceph:
<code bash>
systemctl stop ceph.target
systemctl stop ceph-osd.target
systemctl stop ceph-mgr.target
systemctl stop ceph-mon.target
systemctl stop ceph-mds.target
systemctl stop ceph-crash.service
</code>
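
To double-check that no Ceph daemons are still running before touching the disks, a quick sanity check (not part of the original steps):
<code bash>
# should print no active ceph units
systemctl list-units --state=active 'ceph*'
</code>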

Back up the partition layout:
<code bash>
sgdisk -b nvm.sgdisk /dev/nvme0n1
sgdisk -p /dev/nvme0n1
</code>
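
The ''nvm.sgdisk'' backup has to survive the NVMe swap, so it should be copied off the NVMe; the destination path below is just an assumption:
<code bash>
# keep the partition-table backup on storage that outlives the NVMe
cp nvm.sgdisk /root/nvm.sgdisk
</code>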

Move the ZFS ''nvmpool'' to the HDDs:
<code bash>
zfs destroy hddpool/nvmtemp
zfs create -s -b 8192 -V 387.8G hddpool/nvmtemp  # note: block size forced to match the existing device

ls -l /dev/zvol/hddpool/nvmtemp
lrwxrwxrwx 1 root root 11 01-15 11:00 /dev/zvol/hddpool/nvmtemp -> ../../zd192

zpool attach nvmpool 7b375b69-3ef9-c94b-bab5-ef68f13df47c /dev/zd192
</code>
Resilvering of ''nvmpool'' will begin; watch it with ''zpool status nvmpool 1''.
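
Detaching before the resilver completes would leave the only copy on the device about to be removed, so it is safer to wait for it to finish; a small polling loop (a sketch, not from the original page):
<code bash>
# poll until zpool status no longer reports a running resilver
while zpool status nvmpool | grep -q 'resilver in progress'; do
    sleep 60
done
zpool status nvmpool  # expect state ONLINE, no errors
</code>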

Remove the NVMe from ''nvmpool'':
<code bash>zpool detach nvmpool 7b375b69-3ef9-c94b-bab5-ef68f13df47c</code>

Remove all ZILs, L2ARCs and swap:
<code bash>
swapoff -a
vi /etc/fstab

zpool remove hddpool <ZIL DEVICE>
zpool remove hddpool <L2ARC DEVICE>
zpool remove rpool <L2ARC DEVICE>
</code>
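
Afterwards ''zpool status'' should show both pools without ''logs'' or ''cache'' sections:
<code bash>
zpool status hddpool
zpool status rpool
</code>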

The Ceph OSD will be recreated from scratch, to force a rebuild of the OSD DB (which can be too big due to a metadata bug in a previous Ceph version).
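
Tearing down the old OSD (once the cluster is reachable again) could look like this; the OSD id ''<N>'' and the data device are placeholders, and the commands are a sketch rather than part of the original notes:
<code bash>
ceph osd out <N>                           # stop new data landing on this OSD
systemctl stop ceph-osd@<N>                # stop the OSD daemon
ceph osd purge <N> --yes-i-really-mean-it  # remove it from the cluster map
ceph-volume lvm zap --destroy /dev/sdX     # wipe the old OSD volume (placeholder device)
</code>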

Replace the NVMe device.

Recreate the partitions (a manual sketch follows the list below) or restore them from the backup: <code bash>sgdisk -l nvm.sgdisk /dev/nvme0n1</code>
  * swap
  * rpool_zil
  * hddpool_zil
  * hddpool_l2arc
  * ceph_db (for a 4 GB Ceph OSD, create 4096 MB + 4 MB)
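
If the backup is not usable, the layout can be recreated by hand; a sketch in which all sizes except the ceph_db one are assumptions:
<code bash>
sgdisk -n 1:0:+16G   -c 1:swap          -t 1:8200 /dev/nvme0n1  # size assumed
sgdisk -n 2:0:+8G    -c 2:rpool_zil     -t 2:bf01 /dev/nvme0n1  # size assumed
sgdisk -n 3:0:+8G    -c 3:hddpool_zil   -t 3:bf01 /dev/nvme0n1  # size assumed
sgdisk -n 4:0:+64G   -c 4:hddpool_l2arc -t 4:bf01 /dev/nvme0n1  # size assumed
sgdisk -n 5:0:+4100M -c 5:ceph_db       -t 5:8300 /dev/nvme0n1  # 4096 MB + 4 MB
</code>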

Add ZILs and L2ARCs back.
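
A sketch of re-adding them, assuming the partitions carry the GPT names from the list above:
<code bash>
zpool add rpool   log   /dev/disk/by-partlabel/rpool_zil
zpool add hddpool log   /dev/disk/by-partlabel/hddpool_zil
zpool add hddpool cache /dev/disk/by-partlabel/hddpool_l2arc

# re-create and re-enable swap (the /etc/fstab entry edited earlier is assumed restored)
mkswap /dev/disk/by-partlabel/swap
swapon -a
</code>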

Import ''nvmpool'': <code bash>zpool import nvmpool</code>

Move ''nvmpool'' to the new NVMe partition (let the resilver finish between the attach and the detach):
<code bash>
zpool attach nvmpool zd16 426718f1-1b1e-40c0-a6e2-1332fe5c3f2c
zpool detach nvmpool zd16
</code>
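
A final check that the pool now lives on the NVMe alone, after which the temporary zvol can presumably be dropped:
<code bash>
zpool status nvmpool
zfs destroy hddpool/nvmtemp  # cleanup of the temporary zvol
</code>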
  
===== Replace rpool device =====
  
<code bash>
zpool attach rpool ata-SPCC_Solid_State_Disk_XXXXXXXXXXXX-part3 /dev/disk/by-id/ata-SSDPR-CL100-120-G3_XXXXXXXX-part3
zpool offline rpool ata-SSDPR-CX400-128-G2_XXXXXXXXX-part3
zpool detach rpool ata-SSDPR-CX400-128-G2_XXXXXXXXX-part3
</code>
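
The ''offline''/''detach'' of the old device should only run once the resilver onto the newly attached one has completed; a quick check:
<code bash>
zpool status rpool  # wait until the resilver is done and state is ONLINE
</code>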
  
===== Migrate VM from dead node =====