MDS debugging

Hi all,

Before the weekend we started some copy tests over ceph-fuse. Initially this went fine, but then performance started dropping gradually, and things are running very slowly now:

2014-03-31 13:36:37.047423 mon.0 [INF] pgmap v265871: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 44747 kB/s rd, 216 kB/s wr, 10 op/s
2014-03-31 13:36:38.049286 mon.0 [INF] pgmap v265872: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 4069 B/s rd, 363 kB/s wr, 24 op/s
2014-03-31 13:36:39.057680 mon.0 [INF] pgmap v265873: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 5092 B/s rd, 151 kB/s wr, 22 op/s
2014-03-31 13:36:40.075718 mon.0 [INF] pgmap v265874: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 25961 B/s rd, 1527 B/s wr, 10 op/s
2014-03-31 13:36:41.087764 mon.0 [INF] pgmap v265875: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 71574 kB/s rd, 4564 B/s wr, 17 op/s
2014-03-31 13:36:42.109200 mon.0 [INF] pgmap v265876: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 71238 kB/s rd, 3534 B/s wr, 9 op/s
2014-03-31 13:36:43.128113 mon.0 [INF] pgmap v265877: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 4022 B/s rd, 116 kB/s wr, 24 op/s
2014-03-31 13:36:44.143382 mon.0 [INF] pgmap v265878: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 8030 B/s rd, 117 kB/s wr, 29 op/s
2014-03-31 13:36:45.160405 mon.0 [INF] pgmap v265879: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 7049 B/s rd, 4531 B/s wr, 9 op/s
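To see whether the slow client I/O corresponds to slow requests on the OSD side, the next thing I plan to check is the following (standard ceph CLI; 'ceph osd perf' may not be available on older releases):

ceph health detail        # any 'slow requests' or other warnings
ceph osd perf             # per-OSD commit/apply latency, if available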


ceph-mds seems very busy, and only a single OSD is doing any work!

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
54279 root      20   0 8561m 7.5g 4408 S 105.6 23.8   3202:05 ceph-mds
50242 root      20   0 1378m 373m 6452 S  0.7  1.2 523:38.77 ceph-osd
49446 root      18  -2 10644  356  320 S  0.0  0.0   0:00.00 udevd
49444 root      18  -2 10644  428  320 S  0.0  0.0   0:00.00 udevd
49319 root      20   0 1444m 405m 5684 S  0.0  1.3 513:41.13 ceph-osd
48452 root      20   0 1365m 364m 5636 S  0.0  1.1 551:52.31 ceph-osd
47641 root      20   0 1567m 388m 5880 S  0.0  1.2 754:50.60 ceph-osd
46811 root      20   0 1441m 393m 8256 S  0.0  1.2 603:11.26 ceph-osd
46028 root      20   0 1594m 398m 6156 S  0.0  1.2 657:22.16 ceph-osd
45275 root      20   0 1545m 510m 9920 S 18.9  1.6 943:11.99 ceph-osd
44532 root      20   0 1509m 395m 7380 S  0.0  1.2 665:30.66 ceph-osd
43835 root      20   0 1397m 384m 8292 S  0.0  1.2 466:35.47 ceph-osd
43146 root      20   0 1412m 393m 5884 S  0.0  1.2 506:42.07 ceph-osd
42496 root      20   0 1389m 364m 5292 S  0.0  1.1 522:37.70 ceph-osd
41863 root      20   0 1504m 393m 5864 S  0.0  1.2 462:58.11 ceph-osd
39035 root      20   0  918m 694m 3396 S  3.3  2.2  55:53.59 ceph-mon
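
For what it's worth, this is what I plan to pull from the admin sockets next (the socket paths are the defaults; 'mds.a' and 'osd.N' are placeholders for our actual daemon IDs):

ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok perf dump
    # MDS perf counters (cache, log and request counters)
ceph --admin-daemon /var/run/ceph/ceph-osd.N.asok dump_ops_in_flight
ceph --admin-daemon /var/run/ceph/ceph-osd.N.asok dump_historic_ops
    # in-flight and recent slow ops on the one busy OSD, if these commands are supported by this release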

Does this look familiar to anyone?

How can we debug this further?
I have already set the MDS debug level to 5; the log shows a lot of 'lookup' entries, but no warnings or errors are reported.
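
In case the exact commands matter: I bumped the debug level with something along the lines of the following, and given the 7.5 GB RES of ceph-mds I also want to compare its memory use against the configured cache limit ('mds.0' and the socket path are placeholders for our setup; I believe 'mds cache size' defaults to 100000 inodes):

ceph tell mds.0 injectargs '--debug-mds 5'
    # could be raised to 10 or 20 for more detail
ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok config show | grep mds_cache_size
    # configured inode cache limit on the running MDS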

Thanks!

Kind regards,
Kenneth
