
Differences

This shows you the differences between two versions of the page.

vm:proxmox:ceph:performance [2025/10/31 18:57] niziakvm:proxmox:ceph:performance [2026/06/07 21:13] (current) niziak
Line 7: Line 7:
   * [[https://forum.proxmox.com/threads/ceph-storage-performance.129408/#post-566971|Ceph Storage Performance]]   * [[https://forum.proxmox.com/threads/ceph-storage-performance.129408/#post-566971|Ceph Storage Performance]]
   * [[https://ceph.io/en/news/blog/2024/ceph-a-journey-to-1tibps/|Ceph: A Journey to 1 TiB/s]]   * [[https://ceph.io/en/news/blog/2024/ceph-a-journey-to-1tibps/|Ceph: A Journey to 1 TiB/s]]
 +  * [[https://www.boniface.me/posts/pvc-ceph-tuning-adventures/]]
  
 ===== Performance tips ===== ===== Performance tips =====
Line 12: Line 13:
 Ceph is built for scale and works great in large clusters. In a small cluster every node will be heavily loaded. Ceph is built for scale and works great in large clusters. In a small cluster every node will be heavily loaded.
  
 +  * Ceph ensures data safety: it waits for data to be written to the medium on all replicas. Use enterprise SSDs with PLP (Power Loss Protection) to reduce latency. Some people report an 8x speed increase.
   * adapt PG to number of OSDs to spread traffic evenly   * adapt PG to number of OSDs to spread traffic evenly
   * use ''krbd''   * use ''krbd''
 +  * more OSDs = better parallelism
   * enable ''writeback'' on VMs (possible data loss on consumer SSDs)   * enable ''writeback'' on VMs (possible data loss on consumer SSDs)
   * MTU 9000 (jumbo frames) [[https://ceph.io/en/news/blog/2015/ceph-loves-jumbo-frames/|Ceph Loves Jumbo Frames]]   * MTU 9000 (jumbo frames) [[https://ceph.io/en/news/blog/2015/ceph-loves-jumbo-frames/|Ceph Loves Jumbo Frames]]
-  * [[https://ceph.io/en/news/blog/2024/ceph-a-journey-to-1tibps/|Ceph: A Journey to 1 TiB/s]] +  * net latency <200us (''ping -s 1000 pve'')
-    * Ceph is incredibly sensitive to latency introduced by CPU c-state transitions. Set ''Max perf'' in BIOS to disable C-States or boot Linux with '''' +  * C-States: [[https://ceph.io/en/news/blog/2024/ceph-a-journey-to-1tibps/|Ceph: A Journey to 1 TiB/s]] 
 +    * Ceph is incredibly sensitive to latency introduced by CPU c-state transitions. Set ''Max perf'' in BIOS to disable C-States or boot Linux with ''GRUB_CMDLINE_LINUX="idle=poll intel_idle.max_cstate=0 intel_pstate=disable processor.max_cstate=1" ''
     * Disable IOMMU in kernel     * Disable IOMMU in kernel
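
Two of the tips above (PG count and network latency) can be checked with small shell helpers. This is only a sketch under stated assumptions: the helper names ''pg_count'', ''avg_rtt_us'' and ''check_latency'' are hypothetical, the target of roughly 100 PGs per OSD rounded up to a power of two is a common rule of thumb and not something this page states, and the parser assumes the Linux iputils ''ping'' summary line.

```shell
# Hypothetical helper: suggest a PG count for a pool.
# Assumption: ~100 PGs per OSD, divided by the replica count,
# rounded up to the next power of two (rule of thumb only).
pg_count() (
  osds=$1; replicas=$2; target=${3:-100}
  raw=$(( osds * target / replicas ))
  pg=1
  while [ "$pg" -lt "$raw" ]; do
    pg=$(( pg * 2 ))
  done
  echo "$pg"
)

# Hypothetical helper: read ping output on stdin and print the
# average RTT in microseconds. Assumes a summary line of the form
# "rtt min/avg/max/mdev = a/b/c/d ms" (Linux iputils ping).
avg_rtt_us() {
  awk -F'/' '/min\/avg\/max/ { printf "%.0f\n", $5 * 1000 }'
}

# Hypothetical helper: compare an RTT in microseconds with a limit
# (default 200 us, the figure from the list above).
check_latency() (
  rtt_us=$1; limit_us=${2:-200}
  if [ "$rtt_us" -lt "$limit_us" ]; then
    echo "OK"
  else
    echo "SLOW"
  fi
)
```

Possible usage, with ''pve'' as the peer host name used in the list: ''check_latency "$(ping -c 10 -s 1000 pve | avg_rtt_us)"''. For a pool on 6 OSDs with 3 replicas, ''pg_count 6 3'' suggests 256 PGs under the assumption above.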
  
Line 114: Line 118:
 ceph config get osd osd_journal_size ceph config get osd osd_journal_size
 5120 5120
 +</code>
 +
 +==== bluestore_min_alloc_size ====
 +
 +  * Read: [[https://docs.ceph.com/en/reef/rados/configuration/bluestore-config-ref/#minimum-allocation-size]]
 +  * Restart of OSD needed
 +  * Impact: A smaller value reduces space waste (space amplification) but increases metadata overhead, while a larger value helps with large sequential writes but wastes space on small files.
 +  * The value is applied when an OSD is created, so a change only affects new or freshly deployed OSDs; existing OSDs keep the value they were built with.
 +
 +<code bash>
 +# ceph tell 'osd.*' config show | grep bluestore_min_alloc
 +    "bluestore_min_alloc_size": "0",
 +    "bluestore_min_alloc_size_hdd": "4096",
 +    "bluestore_min_alloc_size_ssd": "4096",
 +
 +# ceph config set osd bluestore_min_alloc_size_hdd 16384
 +</code>
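
The space trade-off described above can be illustrated with a little arithmetic. The sketch below assumes a simplified model in which every object is stored in whole allocation units of ''bluestore_min_alloc_size'' bytes; it ignores compression, erasure coding and metadata, and the helper name ''allocated_bytes'' is hypothetical.

```shell
# Hypothetical helper: bytes consumed on disk by an object of SIZE
# bytes when space is handed out in units of ALLOC bytes.
# Simplified model: round the size up to a whole number of units.
allocated_bytes() (
  size=$1; alloc=$2
  units=$(( (size + alloc - 1) / alloc ))
  echo $(( units * alloc ))
)

# A 1000-byte object with 4 KiB and 16 KiB allocation units
allocated_bytes 1000 4096     # 4096
allocated_bytes 1000 16384    # 16384

# A 20000-byte object with the same two unit sizes
allocated_bytes 20000 4096    # 20480
allocated_bytes 20000 16384   # 32768
```

In this model the larger unit wastes 12288 bytes more than the smaller one on each of the two objects, which is the space amplification mentioned in the list.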
 +
 +==== filestore_op_threads ====
 +
 +<code bash>
 +# ceph tell 'osd.*' config show | grep filestore_op_threads
 +
 +"filestore_op_threads": "2"
 +# ceph tell 'osd.*' config set filestore_op_threads 4
 +
 </code> </code>