This could again be because of the tcmalloc issue I reported earlier.

Two things to observe:

1. Does the performance improve if you stop IO on the other volume? If so, it could be a different issue.
2. Run perf top on the OSD node and see if tcmalloc traces are popping up.

Thanks & Regards
Somnath

-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Nikola Ciprich
Sent: Friday, April 24, 2015 7:10 AM
To: ceph-users@xxxxxxxxxxxxxx
Cc: nik@xxxxxxxxxxx
Subject: very different performance on two volumes in the same pool

Hello,

I'm trying to solve a somewhat mysterious situation:

I've got a 3-node Ceph cluster and a pool made of 3 OSDs (one on each node); the OSDs are 1TB SSD drives. The pool has 3 replicas set. I'm measuring random IO performance using fio:

fio --randrepeat=1 --ioengine=rbd --direct=1 --gtod_reduce=1 --name=test --pool=ssd3r --rbdname=${rbdname} --invalidate=1 --bs=4k --iodepth=64 --readwrite=randread --output=randio.log

It gives very nice performance of ~186K IOPS for random read.

The problem is, I've got one volume on which it gives only ~20K IOPS and I can't figure out why. It was created using Python, so I first suspected it could be similar to the missing layering problem I was asking about here a few days ago, but when I tried reproducing it, I'm getting ~180K IOPS even for other volumes created using Python. So only this one is problematic; the others are fine.

Since there is only one SSD in each box and I'm using 3 replicas, there should not be any difference in the physical storage used between volumes.

I'm using hammer, 0.94.1, fio 2.2.6.
Here's the RBD info:

"slow" volume:

[root@vfnphav1a fio]# rbd info ssd3r/vmtst23-6
rbd image 'vmtst23-6':
        size 30720 MB in 7680 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.1376d82ae8944a
        format: 2
        features:
        flags:

"fast" volume:

[root@vfnphav1a fio]# rbd info ssd3r/vmtst23-7
rbd image 'vmtst23-7':
        size 30720 MB in 7680 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.13d01d2ae8944a
        format: 2
        features:
        flags:

Any idea what could be wrong here?

Thanks a lot in advance!

BR

nik

--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@xxxxxxxxxxx
-------------------------------------
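
As a quick sanity check on the two rbd info outputs quoted in this thread, the sketch below recomputes the object size and object count from the reported order and image size, and compares the layout fields of the "slow" and "fast" volumes. The values are copied by hand from the outputs; the helper names are hypothetical, and the sketch assumes that "MB" in the rbd output means 2^20 bytes, which is the reading that makes the reported numbers consistent with each other. It does not talk to the cluster.

```python
MIB = 1024 * 1024  # assumption: rbd info reports "MB" as 2**20 bytes


def object_size_bytes(order: int) -> int:
    """Object size implied by the image order (2**order bytes)."""
    return 1 << order


def object_count(size_mb: int, order: int) -> int:
    """Objects needed to cover an image of size_mb MB, rounded up."""
    obj = object_size_bytes(order)
    return (size_mb * MIB + obj - 1) // obj


# Layout fields copied by hand from the two `rbd info` outputs.
volumes = {
    "vmtst23-6": {"size_mb": 30720, "order": 22, "format": 2, "features": ""},  # "slow"
    "vmtst23-7": {"size_mb": 30720, "order": 22, "format": 2, "features": ""},  # "fast"
}

for name, info in volumes.items():
    kb = object_size_bytes(info["order"]) // 1024
    count = object_count(info["size_mb"], info["order"])
    print(f"{name}: {kb} kB objects, {count} objects")

# Only the image name and block_name_prefix differ between the two outputs.
identical_layout = volumes["vmtst23-6"] == volumes["vmtst23-7"]
print("identical layout:", identical_layout)
```

Both volumes come out as 4096 kB objects and 7680 objects with the same format and an empty feature list, so the layout shown by rbd info does not by itself explain the IOPS difference.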