====== Ceph performance ======

  * [[https://docs.ceph.com/en/latest/rados/configuration/bluestore-config-ref/#sizing|BlueStore Config Reference: Sizing]]
  * [[https://yourcmc.ru/wiki/Ceph_performance]]
  * [[https://accelazh.github.io/ceph/Ceph-Performance-Tuning-Checklist|Ceph Performance Tuning Checklist]]
  * [[https://www.reddit.com/r/ceph/comments/zpk0wo/new_to_ceph_hdd_pool_is_extremely_slow/|New to Ceph, HDD pool is extremely slow]]
  * [[https://forum.proxmox.com/threads/ceph-storage-performance.129408/#post-566971|Ceph Storage Performance]]

===== Performance tips =====

Ceph is built for scale and works best in large clusters. In a small cluster every node is heavily loaded.

  * adjust the number of PGs to the number of OSDs to spread traffic evenly
  * use ''krbd''
  * enable ''writeback'' cache on VMs (risk of data loss on consumer SSDs)

==== performance on small cluster ====

  * [[https://www.youtube.com/watch?v=LlLLJxNcVOY|Configuring Small Ceph Clusters for Optimal Performance - Josh Salomon, Red Hat]]
  * the number of PGs should be a power of 2 (or halfway between two powers of 2, e.g. 384)
  * same utilization (% full) per device
  * same number of PGs per OSD = same number of requests per device
  * same number of primary PGs per OSD = read operations spread evenly
    * primary PG: the original/first copy of a PG, the others are replicas. The primary PG is used for reads.
  * use relatively more PGs than in a big cluster: the balance is better, but handling PGs consumes resources (RAM)
    * e.g. for 7 OSDs x 2 TB the PG autoscaler recommends 256 PGs. After changing to 384, IOPS increased drastically and latency dropped. Setting 512 PGs was not possible because of the limit of 250 PGs per OSD.
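
The per-OSD limit counts PG replicas, so a rough estimate for one pool is ''pg_num * replica size / number of OSDs''. The sketch below is only illustrative arithmetic: the function name ''pgs_per_osd'' is made up for this page and the replica size of 3 is an assumption, not a value taken from the cluster above. PGs of all pools on the same OSDs add up, so the real figure can be higher.

<code bash>
# pgs_per_osd PG_NUM REPLICA_SIZE NUM_OSDS   (hypothetical helper)
# Prints the average number of PG replicas per OSD for a single pool,
# rounded down by integer division.
pgs_per_osd() {
  echo $(( $1 * $2 / $3 ))
}

pgs_per_osd 256 3 7   # 256 PGs, assumed 3 replicas, 7 OSDs
pgs_per_osd 384 3 7   # 384 PGs, assumed 3 replicas, 7 OSDs
</code>

The actual number of PGs per OSD is shown in the ''PGS'' column of ''ceph osd df'', and the configured limit can be read with ''ceph config get mon mon_max_pg_per_osd''.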

=== balancer ===

<code bash>
ceph mgr module enable balancer   # enable the manager module
ceph balancer on                  # turn on automatic balancing
ceph balancer mode upmap          # balance using pg-upmap entries
</code>

=== CRUSH reweight ===

CRUSH reweight overrides the default CRUSH weight assigned to an OSD.

If possible, use the ''balancer'' instead.
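
A minimal sketch of a manual reweight, in case the balancer cannot be used. The OSD name ''osd.3'' and the weight ''1.6'' are hypothetical values; by convention the CRUSH weight roughly corresponds to the device capacity in TiB, so lowering it moves data away from that OSD.

<code bash>
ceph osd df tree                    # current CRUSH weight and utilization per OSD
ceph osd crush reweight osd.3 1.6   # hypothetical OSD and weight
</code>

Changing a weight triggers data movement, so it is safer to change one OSD at a time and wait for the cluster to settle.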


=== PG autoscaler ===

It is better to run the autoscaler in ''warn'' mode, so that a change in the number of PGs does not put unexpected load on the cluster.
<code bash>
ceph mgr module enable pg_autoscaler
# syntax: ceph osd pool set <pool> pg_autoscale_mode <mode>   (mode: on, off, warn)
ceph osd pool set rbd pg_autoscale_mode warn
</code>

It is possible to set the expected (target) size of a pool. This prevents the autoscaler from moving data every time new data is stored.
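
A short sketch of setting a target size. The pool name ''rbd'' is the one used above; the size ''4T'' is a hypothetical value and has to be replaced with the expected amount of data in the pool.

<code bash>
# syntax: ceph osd pool set <pool> target_size_bytes <size>
ceph osd pool set rbd target_size_bytes 4T   # hypothetical expected pool size
# alternative: expected share of capacity relative to other pools
# ceph osd pool set rbd target_size_ratio 1.0
ceph osd pool autoscale-status               # shows the target size and suggested PG counts
</code>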

==== check cluster balance ====

<code bash>
ceph -s
ceph osd df # shows standard deviation
</code>

There is no standard tool to show primary PG balancing. A script by Josh Salomon is available: [[https://github.com/JoshSalomon/Cephalocon-2019/blob/master/pool_pgs_osd.sh|pool_pgs_osd.sh]]
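
As a rough alternative, primary PGs per OSD can be counted from the PG dump. This is a sketch, not the script linked above: the function name ''count_primaries'' is made up for this page, and it assumes that ''ceph pg dump pgs_brief'' prints the columns ''PG_STAT STATE UP UP_PRIMARY ACTING ACTING_PRIMARY'', i.e. the acting primary is the 6th field. Check the header line on your release before relying on it.

<code bash>
# count_primaries (hypothetical helper)
# Reads "ceph pg dump pgs_brief" output on stdin and prints
# "<osd id> <number of primary PGs>" per line, sorted by OSD id.
# Assumes the acting primary is the 6th column.
count_primaries() {
  awk '$1 ~ /^[0-9]+\.[0-9a-f]+$/ { n[$6]++ }
       END { for (o in n) print o, n[o] }' | sort -n
}

# usage on a cluster node:
#   ceph pg dump pgs_brief 2>/dev/null | count_primaries
</code>

The counts cover all pools together. To look at one pool only, filter the input by the pool id prefix of the PG id (the part before the dot).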


==== performance on slow HDDs ====