Hi,

I don't think it's related, but how full is your ceph cluster? Perhaps it has
something to do with fragmentation on the XFS filesystem
(xfs_db -c frag -r device)?

Udo

On 08.05.2014 02:57, Christian Balzer wrote:
>
> Hello,
>
> ceph 0.72 on Debian Jessie, 2 storage nodes with 2 OSDs each. The journals
> are on (separate) DC 3700s, the actual OSDs are RAID6 behind an Areca 1882
> with 4GB of cache.
>
> Running this fio:
>
> fio --size=400m --ioengine=libaio --invalidate=1 --direct=1 --numjobs=1 --rw=randwrite --name=fiojob --blocksize=4k --iodepth=128
>
> results in:
>
> 30k IOPS on the journal SSD (as expected)
> 110k IOPS on the OSD (it fits neatly into the cache, no surprise there)
> 3200 IOPS from a VM using userspace RBD
> 2900 IOPS from a host kernelspace mounted RBD
>
> When running the fio from the VM RBD the utilization of the journals is
> about 20% (2400 IOPS) and the OSDs are bored at 2% (1500 IOPS after some
> obvious merging).
> The OSD processes are quite busy, reading well over 200% on atop, but
> the system is not CPU or otherwise resource starved at that moment.
>
> Running multiple instances of this test from several VMs on different hosts
> changes nothing, as in the aggregated IOPS for the whole cluster will
> still be around 3200 IOPS.
>
> Now clearly RBD has to deal with latency here, but the network is IPoIB
> with the associated low latency and the journal SSDs are the
> (consistently) fastest ones around.
>
> I guess what I am wondering about is if this is normal and to be expected
> or if not where all that potential performance got lost.
>
> Regards,
>
> Christian
>
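
As a rough sanity check on the numbers quoted above, Little's law (requests in flight = throughput x latency) gives an estimate of the average time each 4k write spends in the RBD path. This is only a sketch: it assumes fio really keeps all 128 requests outstanding for the whole run, and the helper name mean_latency_ms is made up for illustration.

```python
def mean_latency_ms(iodepth: int, iops: float) -> float:
    """Little's law: mean latency = requests in flight / throughput.

    Assumes the queue stays full at `iodepth` for the whole run.
    """
    return iodepth / iops * 1000.0


# IOPS figures taken from the fio results quoted above (--iodepth=128).
results = [("VM, userspace RBD", 3200), ("host, kernel RBD", 2900)]

for label, iops in results:
    print(f"{label}: ~{mean_latency_ms(128, iops):.1f} ms per 4k write")
```

Under that assumption each write takes on the order of 40 ms end to end, which is far more than the IPoIB network or the journal SSDs should account for on their own.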