  * [[https://
  * [[https://
  * [[https://
  * [[https://
| ===== Performance tips ===== | ===== Performance tips ===== | ||
| Line 13: | Line 15: | ||
  * adapt the PG count to the number of OSDs to spread traffic evenly
  * use ''
  * more OSDs = better parallelism
  * enable ''
  * MTU 9000 (jumbo frames) [[https://
  * net latency <200us (''
  * [[https://
  * Ceph is very sensitive to latency introduced by CPU C-state transitions (see the sketch after this list). Set ''
  * disable IOMMU in the kernel
| ==== performance on small cluster ==== | ==== performance on small cluster ==== | ||
| Line 24: | Line 32: | ||
  * primary PG - original/
  * use relatively more PGs than for a big cluster - better balance, but handling PGs consumes resources (RAM)
    * e.g. for 7 OSDs x 2TB the PG autoscaler recommends 256 PGs. After changing to 384, IOPS increase drastically and latency drops (see the example below). Setting 512 PGs wasn't possible because of the 250 PG/OSD limit.
| === balancer === | === balancer === | ||
| Line 53: | Line 63: | ||
| ==== check cluster balance ==== | ==== check cluster balance ==== | ||
| + | <code bash> | ||
| ceph -s | ceph -s | ||
| - | ceph osd df - shows standard deviation | + | ceph osd df # shows standard deviation |
| + | </ | ||
There are no built-in tools to show primary PG balancing. Tool on https://

| + | ==== fragmentation ==== | ||
| + | |||
| + | <code bash> | ||
| + | # ceph tell ' | ||
| + | osd.0: { | ||
| + | " | ||
| + | } | ||
| + | osd.1: { | ||
| + | " | ||
| + | } | ||
| + | osd.2: { | ||
| + | " | ||
| + | } | ||
| + | osd.3: { | ||
| + | " | ||
| + | } | ||
| + | osd.4: { | ||
| + | " | ||
| + | } | ||
| + | osd.5: { | ||
| + | " | ||
| + | } | ||
| + | osd.6: { | ||
| + | " | ||
| + | } | ||
| + | </ | ||
| + | |||
| ==== performance on slow HDDs ==== | ==== performance on slow HDDs ==== | ||
| + | Do not keep '' | ||
| + | <code bash> | ||
| + | ceph config set osd osd_memory_target 4294967296 | ||
| + | ceph config get osd osd_memory_target | ||
| + | 4294967296 | ||
| + | </ | ||
| + | |||
| + | If journal is on SSD, change low_threshold to sth bigger - NOTE - check if is valid for BLuestore, probably this is legacy paramater for Filestore: | ||
| + | <code bash> | ||
| + | # internal parameter calculated from other parameters: | ||
| + | ceph config get osd journal_throttle_low_threshhold | ||
| + | 0.600000 | ||
| + | |||
| + | # 5GB: | ||
| + | ceph config get osd osd_journal_size | ||
| + | 5120 | ||
| + | </ | ||
| + | |||
| + | === mClock scheduler === | ||
| + | |||
| + | * [[https:// | ||
| + | |||
| + | * [[https:// | ||
| + | * [[https:// | ||
| + | |||
| + | Upon startup ceph mClock scheduler performs benchmarking of storage and configure IOPS according to results: | ||
| + | |||
| + | |||
| + | <code bash> | ||
| + | # ceph tell ' | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | " | ||
| + | |||
| + | # ceph tell ' | ||
| + | " | ||
| + | " | ||
| + | </ | ||
| + | |||
| + | Manual benchmark: | ||
| + | <code bash> | ||
| + | ceph tell ' | ||
| + | </ | ||
| + | |||
| + | Override settings: | ||
| + | |||
| + | <code bash> | ||
| + | ceph config dump | grep osd_mclock_max_capacity_iops | ||
| + | |||
| + | for i in $(seq 0 7); do ceph config rm osd.$i osd_mclock_max_capacity_iops_hdd; | ||
| + | ceph config set global osd_mclock_max_capacity_iops_hdd 111 | ||
| + | |||
| + | ceph config dump | grep osd_mclock_max_capacity_iops | ||
| + | </ | ||
| + | |||
| + | == mClock profiles == | ||
| + | |||
| + | <code bash> | ||
| + | ceph tell ' | ||
| + | </ | ||
| + | |||
| + | <code bash> | ||
| + | ceph tell ' | ||
| + | |||
| + | ceph tell ' | ||
| + | </ | ||
| + | |||
| + | == mClock custom profile == | ||
| + | |||
| + | <code bash> | ||
| + | ceph tell ' | ||
| + | </ | ||