> [...] All-SSD cluster I will get roughly 400 IOPS over more
> than 250 devices. I know SAS-SSDs are not ideal, but 250
> looks a bit on the low side of things to me. In the second
> cluster, also All-SSD based, I get roughly 120 4k IOPS. And
> the HDD-only cluster delivers 60 4k IOPS.

Regardless of the specifics: 4KiB write IOPS is definitely not what
Ceph was designed for. Yet so many people know better and use Ceph
for VM disk images, even with logs and databases on them.

> [...] „ceph tell osd bench“ with 4k blocks yields 30k+ IOPS
> for every single device in the big cluster, and all that leads
> to is 400 IOPS in total when writing to it? Even with no
> replication in place? [..]

Checks to do:

* If those are SAS SSDs they must have persistent on-device caches
  ("power loss protection"), so ensure that synchronous writes are
  disabled for them (a quick check is sketched after this list).
* What are the definitions of the metadata pool and of the data pool?
* Are you actually measuring the rate of _metadata_ operations
  (object creation and deletion) or of _data_ operations?
* Do the MON logs report "slow ops"?
* Run 'iotop' and 'top' on one MON and one OSD while running the
  benchmark.
* You mentioned 'iostat': run 'iostat -dk -zxy /dev/sd* 1' on an OSD
  during the benchmark too.
* Run something like 'nuttcp'/'iperf' between one MON and one OSD and
  between one OSD and one client.
* Run 10 and 100 parallel "ceph bench" with 4KiB blocks.
* Run 1 and 10 "ceph bench" with 64KiB and 1MiB blocks.

The last two matter because Ceph does not promise high single-thread
speed, but does much better across *many* threads.
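Some rough, untested command sketches for the checks above. For the
write cache, assuming 'sdparm' and 'hdparm' are available and
'/dev/sdX' is a placeholder for one of the OSD devices:

    # show whether the volatile write cache (WCE) is enabled on a SAS/SCSI device
    sdparm --get=WCE /dev/sdX

    # with power-loss protection present, the usual tuning is to clear it
    sdparm --clear=WCE /dev/sdX

    # rough SATA equivalent
    hdparm -W 0 /dev/sdX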
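For the pool definitions (replica count, pg_num, CRUSH rule), the
standard commands show them:

    # list all pools with size, min_size, pg_num, crush_rule, flags, ...
    ceph osd pool ls detail

    # dump the CRUSH rules those pools reference
    ceph osd crush rule dump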
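For slow ops, the cluster reports them in its health output as well as
in the MON log (the log path below is the usual default, adjust to your
deployment):

    # health warnings, including any "slow ops" entries
    ceph health detail

    # grep the MON log directly
    grep -i "slow ops" /var/log/ceph/ceph-mon.*.log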
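For the network check, a minimal 'iperf3' round trip between two of the
hosts involved (hostnames are placeholders):

    # on the receiving host, e.g. the MON
    iperf3 -s

    # on the sending host, e.g. an OSD node or a client: 10-second test
    iperf3 -c mon-host -t 10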
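For the parallel and larger-block runs, a minimal sketch around
'ceph tell osd.N bench', which takes total bytes and block size as
arguments; the OSD IDs and byte totals are only examples, and Ceph may
refuse runs whose totals exceed its built-in per-run bench limits:

    # 10 parallel 4 KiB benches, one per OSD (adjust the OSD IDs to your cluster)
    for i in $(seq 0 9); do
        ceph tell osd.$i bench 12288000 4096 &
    done
    wait

    # single-OSD runs with 64 KiB and 1 MiB blocks
    ceph tell osd.0 bench 104857600 65536
    ceph tell osd.0 bench 1073741824 1048576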