Hello,

I have a one-node Ceph installation. When I import an image using qemu-img, it works fine for some time; after that the OSD process starts using ~100% CPU, the number of op/s increases, and the write throughput drops dramatically.

The OSD process doesn't appear to be CPU bound, because no CPU is maxed out.

How can I debug what is causing this?

I'm reading from a SATA disk and writing to a pool on an SSD. The journal is in RAM, and I'm using ceph-0.80.7-0.el7.centos.x86_64.

Output from top:

    26344 root 20 0 825900 314516 18852 S 114.0 0.4 5:03.82 ceph-osd
    27547 root 20 0 927364 153440 13044 S  51.7 0.2 2:50.83 qemu-img

Writes in Ceph when the OSD is using a lot of CPU:

    2014-10-30 10:55:08.112259 mon.0 [INF] pgmap v135: 3264 pgs: 3264 active+clean; 27595 MB data, 16116 MB used, 2385 GB / 2405 GB avail; 618 kB/s wr, 1236 op/s
    2014-10-30 10:55:13.111438 mon.0 [INF] pgmap v136: 3264 pgs: 3264 active+clean; 27646 MB data, 16174 MB used, 2385 GB / 2405 GB avail; 643 kB/s wr, 1286 op/s
    2014-10-30 10:55:18.110992 mon.0 [INF] pgmap v137: 3264 pgs: 3264 active+clean; 27693 MB data, 16195 MB used, 2385 GB / 2405 GB avail; 632 kB/s wr, 1264 op/s
    2014-10-30 10:55:23.109454 mon.0 [INF] pgmap v138: 3264 pgs: 3264 active+clean; 27747 MB data, 16195 MB used, 2385 GB / 2405 GB avail; 645 kB/s wr, 1291 op/s

Writes in Ceph when the OSD is at ~10% CPU:

    2014-10-30 09:59:46.140964 mon.0 [INF] pgmap v80: 704 pgs: 704 active+clean; 11935 MB data, 8413 MB used, 2392 GB / 2405 GB avail; 100536 kB/s wr, 98 op/s
    2014-10-30 09:59:51.134338 mon.0 [INF] pgmap v81: 704 pgs: 704 active+clean; 12575 MB data, 8775 MB used, 2392 GB / 2405 GB avail; 107 MB/s wr, 107 op/s
    2014-10-30 09:59:56.157098 mon.0 [INF] pgmap v82: 704 pgs: 704 active+clean; 12991 MB data, 8949 MB used, 2392 GB / 2405 GB avail; 105 MB/s wr, 105 op/s
    2014-10-30 10:00:01.181859 mon.0 [INF] pgmap v83: 704 pgs: 704 active+clean; 13631 MB data, 9114 MB used, 2392 GB / 2405 GB avail; 104 MB/s wr, 105 op/s

Qemu command:

    time qemu-img convert -f raw -O rbd /home/user/backup_2014_10_27.raw rbd:instances_fast/backup_2014_10_27.21

Ceph config:

    [global]
    mon_initial_members = $(hostname -s)
    mon_host = $IP
    public_network = 10.100.0.0/16
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
    osd pool default size = 1
    osd pool default min size = 1
    osd crush chooseleaf type = 0

    ## ssd, journal on same partition
    [osd]
    osd data = /var/lib/ceph/osd/ceph-\$id
    osd journal size = 7000
    osd journal = /var/lib/ceph/osd/ceph-\$id-journal/osd-\$id.journal
    osd crush update on start = false

    ## osd 1 is on ssd and we put journal on ram
    [osd.1]
    osd journal = /dev/shm/osd.1.journal
    journal dio = false

Test performed with dd:

    sync
    dd bs=4M count=512 if=/home/user/backup_2014_10_27.raw of=/var/lib/ceph/osd/ceph-1/backup_2014_10_27.raw conv=fdatasync
    512+0 records in
    512+0 records out
    2147483648 bytes (2.1 GB) copied, 16.3971 s, 131 MB/s

Thank you,
Cristian Falcas
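
One rough way to quantify the difference between the two sets of pgmap lines quoted in this post is to divide the write rate by the op rate, which gives the average write size per op. The following is only a minimal Python sketch: the helper name bytes_per_op is hypothetical, and it assumes that the kB/MB units in the log are powers of 1024.

```python
import re

# Matches the trailing "<rate> <unit>/s wr, <n> op/s" part of a pgmap log line.
PGMAP_RE = re.compile(r"(\d+(?:\.\d+)?) (B|kB|MB|GB)/s wr, (\d+) op/s")

# Assumption: the log uses binary multiples (1 kB = 1024 bytes).
UNIT_BYTES = {"B": 1, "kB": 1024, "MB": 1024**2, "GB": 1024**3}


def bytes_per_op(line):
    """Return the average write size in bytes per op.

    Returns None if the line has no write statistics or reports 0 op/s.
    """
    m = PGMAP_RE.search(line)
    if m is None:
        return None
    rate = float(m.group(1))
    unit = m.group(2)
    ops = int(m.group(3))
    if ops == 0:
        return None
    return rate * UNIT_BYTES[unit] / ops


# One line from each phase, copied from the post.
slow = ("2014-10-30 10:55:08.112259 mon.0 [INF] pgmap v135: 3264 pgs: "
        "3264 active+clean; 27595 MB data, 16116 MB used, "
        "2385 GB / 2405 GB avail; 618 kB/s wr, 1236 op/s")
fast = ("2014-10-30 09:59:51.134338 mon.0 [INF] pgmap v81: 704 pgs: "
        "704 active+clean; 12575 MB data, 8775 MB used, "
        "2392 GB / 2405 GB avail; 107 MB/s wr, 107 op/s")

if __name__ == "__main__":
    print("slow phase:", bytes_per_op(slow), "bytes/op")
    print("fast phase:", bytes_per_op(fast), "bytes/op")
```

Under that unit assumption, the slow-phase line works out to about 512 bytes per op and the fast-phase line to about 1 MB per op, so the slow phase consists of many very small writes rather than fewer large ones.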