Hi, Cephers. I would like to hear your ideas about a strange situation we have in one of our clusters. It's a Luminous 12.2.12 cluster. Recently we added 3 nodes with 10 SSD OSDs each and dedicated them to the SSD pool that backs our OpenStack volumes. Initial tests went well: IOPS were great, throughput was perfect, all good. Then the first real usage arrived: very limited IOPS (~450), disk utilization near 100%, and throughput below 1 MB/s put us in tears.

After some investigation we found that this situation only occurs when all of these conditions are met:

1. The disk is an RBD volume (the same test from the same server against local disks was fine)
2. The file system is XFS (no problems with ext4)
3. The file system block size is bigger than the write size
4. Only one fio job (numjobs=1) is used

When at least one of these conditions is not met, we get ~40k IOPS, great throughput, etc.

We ran fio tests with different values, and the pattern is quite clear (rough fio commands are in the P.S. below): if the write size is 4 KB (same as the block size), IOPS go up to ~40k. If the write size is 3 KB, it drops to ~450 IOPS, and from that point it doesn't matter how small the write is - it's always ~450 IOPS. After changing the file system block size to 2 KB the situation is the same: great speed until the write is smaller than 2 KB. If we raise the fio parameter "numjobs" to 10, we get the maximum possible IOPS, ~40k, which is far more than a simple 10x increase.

Any ideas what is going on, and why writes smaller than the block size hurt performance so badly on XFS, while ext4 has no problems?

Thank you for all the ideas!

Arvydas
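P.S. For reference, this is roughly how we tested. The mount point, ioengine, iodepth, direct and runtime settings below are simplified placeholders - only the write size (--bs), the file system block size and --numjobs are the values we actually varied:

  # XFS block size is chosen at mkfs time, e.g. 4 KB:
  # mkfs.xfs -b size=4096 /dev/rbd0

  # "fast" case: 4 KB writes, equal to the 4 KB XFS block size, one job -> ~40k IOPS
  fio --name=rbdtest --filename=/mnt/rbd-xfs/testfile --size=1G \
      --rw=randwrite --ioengine=libaio --iodepth=32 --direct=1 \
      --bs=4k --numjobs=1 --runtime=60 --time_based --group_reporting

  # "slow" case: write size (3 KB) smaller than the XFS block size -> ~450 IOPS
  fio --name=rbdtest --filename=/mnt/rbd-xfs/testfile --size=1G \
      --rw=randwrite --ioengine=libaio --iodepth=32 --direct=1 \
      --bs=3k --numjobs=1 --runtime=60 --time_based --group_reporting

  # with --numjobs=10 the 3 KB case goes back up to ~40k IOPS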