Re: osd 100% cpu, very slow writes

On Thu, Oct 30, 2014 at 8:13 AM, Cristian Falcas
<cristi.falcas@xxxxxxxxx> wrote:
> Hello,
>
> I have a one-node Ceph installation. When importing an image using
> qemu, it works fine for a while, but then the OSD process starts
> using ~100% CPU, the number of op/s increases, and the write
> throughput drops dramatically. The OSD process doesn't appear to be
> CPU-bound, since no CPU is maxed out.
>
> How can I debug what is causing this?

Exactly what are you doing to the cluster during this time? How are
you doing the imports? Do they include snapshots?

It sounds like a bunch of extra work, something like creating or
cleaning up snapshots, is being dumped on the OSD while it's doing
the copy.
-Greg

>
> I'm reading from a SATA disk and writing to a pool on an SSD. The
> journal is in RAM, and I'm using ceph-0.80.7-0.el7.centos.x86_64.
>
> Output from top:
>
> 26344 root      20   0  825900 314516  18852 S 114.0  0.4   5:03.82 ceph-osd
> 27547 root      20   0  927364 153440  13044 S  51.7  0.2    2:50.83 qemu-img
>
> Writes in Ceph while the OSD is using a lot of CPU:
>
> 2014-10-30 10:55:08.112259 mon.0 [INF] pgmap v135: 3264 pgs: 3264
> active+clean; 27595 MB data, 16116 MB used, 2385 GB / 2405 GB avail;
> 618 kB/s wr, 1236 op/s
> 2014-10-30 10:55:13.111438 mon.0 [INF] pgmap v136: 3264 pgs: 3264
> active+clean; 27646 MB data, 16174 MB used, 2385 GB / 2405 GB avail;
> 643 kB/s wr, 1286 op/s
> 2014-10-30 10:55:18.110992 mon.0 [INF] pgmap v137: 3264 pgs: 3264
> active+clean; 27693 MB data, 16195 MB used, 2385 GB / 2405 GB avail;
> 632 kB/s wr, 1264 op/s
> 2014-10-30 10:55:23.109454 mon.0 [INF] pgmap v138: 3264 pgs: 3264
> active+clean; 27747 MB data, 16195 MB used, 2385 GB / 2405 GB avail;
> 645 kB/s wr, 1291 op/s
>
>
> Writes in Ceph while the OSD is at ~10% CPU:
>
> 2014-10-30 09:59:46.140964 mon.0 [INF] pgmap v80: 704 pgs: 704
> active+clean; 11935 MB data, 8413 MB used, 2392 GB / 2405 GB avail;
> 100536 kB/s wr, 98 op/s
> 2014-10-30 09:59:51.134338 mon.0 [INF] pgmap v81: 704 pgs: 704
> active+clean; 12575 MB data, 8775 MB used, 2392 GB / 2405 GB avail;
> 107 MB/s wr, 107 op/s
> 2014-10-30 09:59:56.157098 mon.0 [INF] pgmap v82: 704 pgs: 704
> active+clean; 12991 MB data, 8949 MB used, 2392 GB / 2405 GB avail;
> 105 MB/s wr, 105 op/s
> 2014-10-30 10:00:01.181859 mon.0 [INF] pgmap v83: 704 pgs: 704
> active+clean; 13631 MB data, 9114 MB used, 2392 GB / 2405 GB avail;
> 104 MB/s wr, 105 op/s
>
> Qemu command:
>
> time qemu-img convert -f raw -O rbd /home/user/backup_2014_10_27.raw
> rbd:instances_fast/backup_2014_10_27.21
>
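
To rule out qemu's rbd driver itself, it might be worth comparing
against a plain rbd import of the same file (the target image name
here is just an example):

rbd import /home/user/backup_2014_10_27.raw instances_fast/import_test

If that shows the same collapse in throughput after a while, the
problem is on the OSD side rather than in qemu-img.
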
> Ceph config:
>
> [global]
> mon_initial_members = $(hostname -s)
> mon_host = $IP
> public_network = 10.100.0.0/16
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> osd pool default size = 1
> osd pool default min size = 1
> osd crush chooseleaf type = 0
>
> ## ssd, journal on same partition
> [osd]
> osd data = /var/lib/ceph/osd/ceph-\$id
> osd journal size = 7000
> osd journal = /var/lib/ceph/osd/ceph-\$id-journal/osd-\$id.journal
> osd crush update on start = false
>
> ## osd 1 is on ssd and we put journal on ram
> [osd.1]
> osd journal = /dev/shm/osd.1.journal
> journal dio = false
>
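
Since the journal for osd.1 lives on tmpfs with journal dio = false,
it's also worth watching the OSD's latency counters while the
slowdown happens (again assuming the default admin socket path):

# dump osd.1's perf counters; the filestore section includes journal
# and apply/commit latencies
ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok perf dump

If the journal latency stays near zero while apply/commit latency
climbs, the backing filesystem (or background work like snapshot
trimming) is a more likely culprit than the journal.
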
> Test performed with dd:
>
> sync
> dd bs=4M count=512  if=/home/user/backup_2014_10_27.raw
> of=/var/lib/ceph/osd/ceph-1/backup_2014_10_27.raw conv=fdatasync
> 512+0 records in
> 512+0 records out
> 2147483648 bytes (2.1 GB) copied, 16.3971 s, 131 MB/s
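
The dd run only exercises the filesystem under the OSD, not the full
RADOS write path (journal and object creation). A rados bench against
the same pool is a closer comparison; a minimal example (it writes
4 MB objects by default):

rados bench -p instances_fast 30 write

If that sustains ~100 MB/s for the whole run, the slowdown is
specific to how the image import drives the cluster; if it degrades
the same way partway through, the OSD itself is the place to look.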
>
> Thank you,
> Cristian Falcas
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com