  * [[https://yourcmc.ru/wiki/Ceph_performance]]
  * [[https://accelazh.github.io/ceph/Ceph-Performance-Tuning-Checklist|Ceph Performance Tuning Checklist]]
  * [[https://www.reddit.com/r/ceph/comments/zpk0wo/new_to_ceph_hdd_pool_is_extremely_slow/|New to Ceph, HDD pool is extremely slow]]
  * [[https://forum.proxmox.com/threads/ceph-storage-performance.129408/#post-566971|Ceph Storage Performance]]
  
===== Performance tips =====

Ceph is built for scale and works great in large clusters. In a small cluster, every node will be heavily loaded.

  * adapt the number of PGs to the number of OSDs to spread traffic evenly
  * use ''krbd''
  * enable ''writeback'' cache on VMs (possible data loss on consumer SSDs); see the example below
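A minimal sketch of both settings on Proxmox (the storage name ''ceph-rbd'', VMID ''100'' and disk name are placeholders, not from this page):

<code bash>
# use the kernel RBD client instead of librbd for this storage
pvesm set ceph-rbd --krbd 1

# enable writeback cache on a VM disk
# (risk of data loss on power failure with consumer SSDs)
qm set 100 --scsi0 ceph-rbd:vm-100-disk-0,cache=writeback
</code>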
  
==== performance on small cluster ====
  
  * [[https://www.youtube.com/watch?v=LlLLJxNcVOY|Configuring Small Ceph Clusters for Optimal Performance - Josh Salomon, Red Hat]]
  * the number of PGs should be a power of 2 (or halfway between two powers of 2)
  * same utilization (% full) per device
  * same number of PGs per OSD = same number of requests per device
  * same number of primary PGs per OSD = read operations spread evenly
    * the primary PG is the original/first PG; the others are replicas. The primary PG is used for reads.
  * use relatively more PGs than in a big cluster for better balance, but handling PGs consumes resources (RAM)
    * e.g. for 7 OSDs x 2 TB the PG autoscaler recommends 256 PGs. After changing to 384 (see the command below), IOPS increase drastically and latency drops. Setting 512 PGs wasn't possible because of the 250 PG/OSD limit.
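For example, raising the PG count of the ''rbd'' pool to the value from the example above (adjust pool name and value to your cluster; the 250 PG/OSD cap is the ''mon_max_pg_per_osd'' setting):

<code bash>
ceph osd pool set rbd pg_num 384
</code>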

=== balancer ===

<code bash>
ceph mgr module enable balancer
ceph balancer on
ceph balancer mode upmap
</code>
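To verify that the balancer is active and see what it last did:

<code bash>
ceph balancer status
</code>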
  
=== CRUSH reweight ===

If possible, use the ''balancer'' instead; reweighting manually overrides the default CRUSH weight assignment.
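A minimal sketch (''osd.3'' and the weight are example values; ''crush reweight'' sets the CRUSH weight, which roughly corresponds to the device size in TiB):

<code bash>
ceph osd crush reweight osd.3 1.8
</code>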
  
  
=== PG autoscaler ===

Better to run it in ''warn'' mode, so it does not put unexpected load on the cluster when the PG number changes.
<code bash>
ceph mgr module enable pg_autoscaler
# ceph osd pool set <pool> pg_autoscale_mode <mode>
ceph osd pool set rbd pg_autoscale_mode warn
</code>
  
It is possible to set a desired/target size for a pool. This prevents the autoscaler from moving data around every time new data is stored; see the example below.
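For example (assuming the ''rbd'' pool; pick either an expected share of the cluster or an absolute size):

<code bash>
ceph osd pool set rbd target_size_ratio 0.5
# or:
ceph osd pool set rbd target_size_bytes 2T
</code>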
  
==== check cluster balance ====
  
<code bash>
ceph -s
ceph osd df   # shows per-OSD utilization and standard deviation
</code>
  
There is no built-in tool to show primary PG balancing; a helper script is available at [[https://github.com/JoshSalomon/Cephalocon-2019/blob/master/pool_pgs_osd.sh]].
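A rough one-liner to count acting-primary PGs per OSD (a sketch; in ''pgs_brief'' output the last column is the acting primary, but the column layout can differ between Ceph releases):

<code bash>
ceph pg dump pgs_brief | awk '$NF ~ /^[0-9]+$/ {print $NF}' | sort -n | uniq -c
</code>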
  
  
==== performance on slow HDDs ====