Re: Experience with 100G Ceph in Proxmox

Hello Anthony,

No, there is no QoS applied to the VMs.

The server has PCIe Gen 4.

ceph osd dump | grep pool
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 21 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 13.04
pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 598 lfor 0/598/596 flags hashpspool stripe_width 0 application cephfs read_balance_score 2.02
pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 50 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 2.42
pool 4 'cephvm' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 16386 lfor 0/644/2603 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd read_balance_score 1.52

I think this is the default config. 🙈
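(For reference, a minimal sketch of how the PG split could be inspected and, if needed, adjusted by hand. The pool name cephvm comes from the dump above; 256 is only an illustrative target, not a sizing recommendation.)

  ceph osd pool autoscale-status
  # either set a floor the autoscaler must respect...
  ceph osd pool set cephvm pg_num_min 256
  # ...or disable autoscaling for the pool and raise pg_num directly
  ceph osd pool set cephvm pg_autoscale_mode off
  ceph osd pool set cephvm pg_num 256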

I will look for a firmware update from Supermicro for my chassis.

Thank you


On 18.03.2025 at 17:57, Anthony D'Atri wrote:
Then I tested on the *Proxmox host*, and the results were significantly better.
My Proxmox prowess is limited, but from my experience with other virtualization platforms, I have to ask if there is any QoS throttling applied to VMs.  With OpenStack or DO  there is often IOPS and/or throughput throttling via libvirt to mitigate noisy neighbors.
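For context, this is roughly what such a cap looks like in a libvirt domain XML (values purely illustrative); Proxmox expresses the same idea with iops_rd/mbps_rd style limits on the VM's disk line rather than through libvirt:

  <disk type='network' device='disk'>
    <driver name='qemu' type='raw'/>
    <iotune>
      <total_iops_sec>20000</total_iops_sec>        <!-- illustrative cap -->
      <total_bytes_sec>524288000</total_bytes_sec>  <!-- ~500 MB/s, illustrative -->
    </iotune>
  </disk>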

  fio --name=host-test --filename=/dev/rbd0 --ioengine=libaio --rw=randread --bs=4k --numjobs=4 --iodepth=32 --size=1G --runtime=60 --group_reporting

IOPS: 1.54M
Bandwidth: 6032 MiB/s (6325 MB/s)
Latency:
  Avg: 39.8 µs
  99.9th percentile: 71 µs
CPU Usage: usr=22.60%, sys=77.13%

On 18.03.2025 at 15:27, Anthony D'Atri wrote:
Which NVMe drive SKUs specifically?
/dev/nvme6n1 – KCD61LUL15T3 – 15.36 TB – SN: 6250A02QT5A8
/dev/nvme5n1 – KCD61LUL15T3 – 15.36 TB – SN: 42R0A036T5A8
/dev/nvme4n1 – KCD61LUL15T3 – 15.36 TB – SN: 6250A02UT5A8
Kioxia CD6. If you were using client-class drives, all manner of performance issues would be expected.

Is your server chassis at least PCIe Gen 4?  If it’s Gen 3 that may hamper these drives.
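A quick way to check (sketch; run as root): the negotiated link speed of each NVMe controller should read 16GT/s for Gen 4, while 8GT/s would mean it trained at Gen 3.

  # 0108 is the PCI class code for NVMe controllers
  lspci -d ::0108 -vv | grep -E 'LnkCap|LnkSta'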

Also, how many of these are in your cluster?  If it’s a small number you might still benefit from chopping each into at least 2 separate OSDs.
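For what it's worth, that split is commonly done with ceph-volume; a sketch, using the device names from the list above (existing OSDs would have to be drained and destroyed first, one drive at a time):

  # example only: create two OSDs per NVMe device
  ceph-volume lvm batch --osds-per-device 2 /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1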

And please send `ceph osd dump | grep pool`; having too few PGs wouldn't do you any favors.


Are you running a recent kernel?
penultimate: 6.8.12-8-pve (VM, yes)
Groovy. If you were running, say, a CentOS 6 or CentOS 7 kernel, then NVMe issues might be expected, as old kernels had rudimentary NVMe support.

  Have you updated firmware on the NVMe devices?
No.
Kioxia appears not to release firmware updates publicly, but your chassis brand (Dell, HP, SMCI, etc.) might have an update.
e.g. https://www.dell.com/support/home/en-vc/drivers/driversdetails?driverid=7ny55

  If there is an available update I would strongly suggest applying.
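Before hunting for an update it can help to note the firmware revision the drives currently run, e.g. with nvme-cli or smartmontools:

  nvme list                                    # FW Rev column
  smartctl -i /dev/nvme4n1 | grep -i firmware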


Thanks again,

best regards,
Gio

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



