====== ZFS performance tuning tips ======

Copy-paste snippet:
<code bash>
zfs set recordsize=1M hddpool
zfs set recordsize=1M nvmpool
zfs set compression=zstd hddpool
zfs set compression=zstd nvmpool
</code>
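
To verify (pool names as above):

<code bash>
zfs get recordsize,compression hddpool nvmpool
</code>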
  
===== zil limit =====
  
ZFS parameter [[https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html#zil-slog-bulk|zil_slog_bulk]] throttles the LOG (SLOG) device load. In older ZFS releases the value was 768 kB; currently it is 64 MB. All sync write requests above this size are treated as async requests and written directly to the slower main devices.

<file ini /etc/modprobe.d/zfs.conf>
options zfs zil_slog_bulk=67108864
options zfs l2arc_write_max=67108864
</file>
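
The same values can also be applied at runtime via sysfs, without reloading the module (64 MiB = 67108864 bytes):

<code bash>
echo 67108864 > /sys/module/zfs/parameters/zil_slog_bulk
echo 67108864 > /sys/module/zfs/parameters/l2arc_write_max
</code>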
  
See similar for L2ARC: [[https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html#l2arc-write-max|l2arc_write_max]]
  * less metadata
  * less fragmentation
  * zvol: huge overhead if the guest uses small block sizes - try to match the guest FS block size to ''volblocksize'' - do not set a 4 kB volblocksize!
  
Note: ''recordsize'' / ''volblocksize'' only defines an upper limit. Smaller files can still be stored as smaller records (is this also true for zvol blocks?).
  * 16kB for MySQL/InnoDB
  * 128kB for rotational HDDs
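
For example, for a dedicated InnoDB dataset (hypothetical dataset name; 16 kB matches the InnoDB page size):

<code bash>
zfs set recordsize=16K hddpool/mysql
</code>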

Check actual request sizes with a histogram:

<code bash>
zpool iostat -r
</code>

===== zvol for guest =====

  * match ''volblocksize'' to the guest block size
  * do not use a CoW guest filesystem on top of CoW (ZFS)
  * do not use qcow2 files on ZFS
  * use two zvols per guest FS - one for data and a second one for the journal (see the sketch below)
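
A minimal sketch of the two-zvol layout for an ext4 guest, assuming hypothetical pool/zvol names and sizes (''mke2fs -O journal_dev'' creates an external journal device):

<code bash>
# hypothetical names and sizes - adjust to your setup
zfs create -V 32G -o volblocksize=16K hddpool/data/vm-100-disk-0
zfs create -V 1G hddpool/data/vm-100-journal-0
# put the ext4 journal on the second zvol:
mke2fs -b 4096 -O journal_dev /dev/zvol/hddpool/data/vm-100-journal-0
mkfs.ext4 -b 4096 -J device=/dev/zvol/hddpool/data/vm-100-journal-0 /dev/zvol/hddpool/data/vm-100-disk-0
</code>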
  
===== Tune L2ARC for backups =====
<file conf /etc/modprobe.d/zfs.conf>
options zfs l2arc_mfuonly=1 l2arc_noprefetch=0
</file>
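
Or at runtime:

<code bash>
echo 1 > /sys/module/zfs/parameters/l2arc_mfuonly
echo 0 > /sys/module/zfs/parameters/l2arc_noprefetch
</code>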
<code bash>
echo 2 > /sys/module/zfs/parameters/zfs_vdev_async_write_max_active
cat /sys/module/zfs/parameters/zfs_vdev_async_write_max_active
</code>
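
To persist it, the same modprobe pattern used elsewhere on this page applies:

<file ini /etc/modprobe.d/zfs.conf>
options zfs zfs_vdev_async_write_max_active=2
</file>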
<code bash>
zfs set recordsize=1M hddpool/data
zfs set recordsize=1M hddpool/vz
</code>
<code bash>
zfs rename hddpool/data/vm-156-disk-0 hddpool/data/vm-156-disk-0-backup
zfs rename hddpool/data/vm-156-disk-0-16k hddpool/data/vm-156-disk-0
</code>
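
The ''-16k'' zvol has to be created and filled before the rename; a minimal sketch of that step, assuming a hypothetical size:

<code bash>
zfs create -V 32G -o volblocksize=16K hddpool/data/vm-156-disk-0-16k
dd if=/dev/zvol/hddpool/data/vm-156-disk-0 of=/dev/zvol/hddpool/data/vm-156-disk-0-16k bs=1M status=progress
</code>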
<code bash>
# ONLY for SSD/NVM devices:
zfs set logbias=throughput <pool>/postgres
</code>
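
PostgreSQL uses 8 kB pages, so the matching recordsize is commonly set alongside it (same placeholder dataset as above):

<code bash>
zfs set recordsize=8K <pool>/postgres
</code>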
<code>
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  size      avail
16:47:26              0        0        0        0   15G   15G   1.8G
</code>
<code>
        Dnode cache size (hard limit):                 10.0 %    1.2 GiB
        Dnode cache size (current):                     5.3 %   63.7 MiB
</code>
 echo "$[4 * 1024*1024*1024]" >/sys/module/zfs/parameters/zfs_arc_max echo "$[4 * 1024*1024*1024]" >/sys/module/zfs/parameters/zfs_arc_max
 echo "$[128     *1024*1024]" >/sys/module/zfs/parameters/zfs_arc_min echo "$[128     *1024*1024]" >/sys/module/zfs/parameters/zfs_arc_min
 +
  
 </code> </code>
Make options persistent:
  
<file ini /etc/modprobe.d/zfs.conf>
options zfs zfs_prefetch_disable=1
options zfs zfs_arc_max=4294967296
options zfs zfs_arc_min=134217728
options zfs zfs_arc_meta_limit_percent=75
</file>
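
On Debian/Proxmox, if the zfs module is loaded from the initramfs (e.g. root on ZFS), rebuild it so the options take effect at boot:

<code bash>
update-initramfs -u -k all
</code>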