4x OSD nodes with 8x 15.3TB NVMe each.
Pools are replicated pools (2x).
PG allocation is around 40 PGs/OSD.

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------

-----Original Message-----
From: Anthony D'Atri <anthony.datri@xxxxxxxxx>
Sent: Thursday, November 11, 2021 7:05 AM
To: ceph-users@xxxxxxx
Subject: Re: snaptrim blocks io on ceph pacific even on fast NVMEs

> How many OSDs do you have per NVMe drive?
> We increased from 2/NVMe to 4/NVMe and it improved snaptrimming quite a lot.

Interesting. Most analyses I’ve seen report diminishing returns with more than two OSDs per device. There are definitely serialization bottlenecks in the PG and OSD code, so I’m curious about the number and size of the NVMe devices you’re using, and especially their PG ratio. Not lowballing the PGs per OSD can have a similar effect with less impact on CPU and RAM. YMMV.

> I guess the utilisation of the NVMes when you snaptrim is not 100%.

Take the iostat %util field with a grain of salt, like the load average. Both are traditional metrics whose meanings have diffused as systems have evolved over the years.

— aad
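
To make the PG-ratio discussion concrete, here is a minimal back-of-the-envelope sketch using the figures quoted in the thread (4 OSD nodes, 8 NVMe per node, 2x replication, roughly 40 PGs/OSD); the pg_num values and the 2-vs-4 OSDs-per-drive split are illustrative assumptions, not reported configuration.

```python
# Rough PG-ratio arithmetic for the cluster described in the thread, assuming
# 4 OSD nodes, 8 NVMe drives per node, and 2x replicated pools.  The pg_num
# totals and the per-drive OSD splits below are illustrative guesses, not
# values reported by the posters.

def pgs_per_osd(total_pgs: int, replica_size: int, num_osds: int) -> float:
    """Each PG places `replica_size` copies, and every copy lands on an OSD."""
    return total_pgs * replica_size / num_osds

nodes, drives_per_node, replica_size = 4, 8, 2

for osds_per_drive in (2, 4):
    num_osds = nodes * drives_per_node * osds_per_drive
    for total_pgs in (1024, 2048, 4096):
        ratio = pgs_per_osd(total_pgs, replica_size, num_osds)
        print(f"{osds_per_drive} OSDs/NVMe ({num_osds} OSDs), "
              f"pg_num={total_pgs}: ~{ratio:.0f} PGs per OSD")
```

Both levers discussed above, splitting each device into more OSDs and raising pg_num, change how many PGs can be worked on in parallel; as Anthony notes, the second one avoids the extra CPU and RAM cost of running additional OSD daemons.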