issues

failed to get canonical path of

/usr/sbin/grub-probe: error: failed to get canonical path of `/dev/disk/by-id/scsi-35000000000000278-part3'.

From the beginning, all SATA drives were linked in /dev/disk/by-id as:

  • ata-???
  • scsi-???
  • wwn-???

After the first SAS drive was inserted into the mixed array, the kernel assigns scsi-??? links only to SAS drives. The initial zpool configuration, which used 'scsi-???' IDs, was therefore wrong.

zpool status -P rpool
 
	NAME                                                            STATE     READ WRITE CKSUM
	rpool                                                           ONLINE       0     0     0
	  mirror-2                                                      ONLINE       0     0     0
	    /dev/disk/by-partuuid/240b1914-4002-4b48-8644-2cefeee091f2  ONLINE       0     0     0
	    /dev/disk/by-id/scsi-35000000000000278-part3                ONLINE       0     0     0
 
zpool offline rpool scsi-35000000000000278-part3
zpool detach rpool scsi-35000000000000278-part3
zpool attach rpool 240b1914-4002-4b48-8644-2cefeee091f2 wwn-0x5000000000000278-part3
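
Before re-attaching, it can help to check which stable names currently point at the partition. The following is a minimal sketch, not part of the original notes: `list_aliases` is a hypothetical helper name, and it assumes GNU `readlink -f` is available (as on a standard Proxmox/Debian install). It prints every symlink in a link directory (default `/dev/disk/by-id`) that resolves to the same node as the given device.

```shell
# list_aliases DEVICE [LINK_DIR]   (hypothetical helper, for illustration only)
# Print every symlink in LINK_DIR (default: /dev/disk/by-id) that resolves
# to the same device node as DEVICE.
list_aliases() {
    dev=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (also covers an empty directory).
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$dev" ]; then
            printf '%s\n' "$link"
        fi
    done
}
```

Calling it with the partition in question, for example `list_aliases /dev/disk/by-id/wwn-0x5000000000000278-part3`, shows whether an ata-, scsi- or wwn- link exists for that partition; a name missing from the output is one that zpool and grub-probe cannot resolve.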

removal pending data

“pending removals”

Removed garbage: 0 B
Removed chunks: 0
Pending removals: 647.072 GiB (in 301488 chunks)
Original data usage: 119.233 TiB

However, to keep chunks from being deleted prematurely because of clock skew, relatime, or backups that are still running, PBS applies a grace period of 24 hours and 5 minutes.

The atime is there to protect chunks that have been added by backups running concurrently with the GC task. The GC task only treats chunks as "used" if they are referenced by indices in the datastore. A backup snapshot that is still being created doesn't have any valid indices yet (those are finalized when the backup finishes), so to protect the chunks newly added by such snapshots, GC only considers chunks eligible for removal if they are older than 24h or older than the oldest backup worker running at the start of the GC (whichever is earlier). The 24h are needed because, depending on the local setup, atime might not be updated again if the last update (GC/..) was within the last 24h.
The 24 hours 5 minutes is because of relatime: with it, the atime is only updated once every 24 hours, and PBS uses the atime to decide which chunks are still in use and which are not.
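
The cutoff rule described above can be sketched as a small shell function. This illustrates the rule as stated in these notes and is not PBS's actual implementation: `gc_cutoff` is a hypothetical name, timestamps are plain epoch seconds, and applying the 24 h 5 min grace period only to the "now" branch is an assumption.

```shell
# gc_cutoff NOW [OLDEST_WORKER_START]   (hypothetical helper, for illustration only)
# Print the epoch timestamp before which a chunk's atime makes it eligible
# for removal: the earlier of (NOW - 24 h 5 min) and the start time of the
# oldest backup worker still running when GC started.
gc_cutoff() {
    now=$1
    oldest_worker=${2:-$now}          # no running backup: only the grace period applies
    grace=$((24 * 3600 + 5 * 60))     # 24 h 5 min = 86700 s
    cutoff=$((now - grace))
    if [ "$oldest_worker" -lt "$cutoff" ]; then
        cutoff=$oldest_worker
    fi
    printf '%s\n' "$cutoff"
}
```

Chunks whose atime is newer than the printed value are kept for now and counted under "Pending removals"; under this rule, a later GC run can remove them once their atime falls behind the cutoff.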

Backup tasks hang