Hi,
I can see you're running mon, mds and osd on the same server. If you only have 16GB in that system, I'm guessing you're swapping by now (or close to it). How much memory does the machine actually have?
Also, how busy are the disks? Or is the load primarily CPU-bound? Are there many processes waiting for run time, or a high interrupt count?
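Something along these lines should show that quickly (assuming the sysstat package is installed for iostat):

  free -m          # total/used memory and how much swap is in use
  vmstat 1 5       # r = run queue, si/so = swap in/out, in = interrupts, wa = iowait
  iostat -x 1 5    # per-disk %util and await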
/Martin
On Mon, Mar 31, 2014 at 1:49 PM, Kenneth Waegeman <Kenneth.Waegeman@xxxxxxxx> wrote:
Hi all,
Before the weekend we started some copy tests over ceph-fuse. Initially this went fine, but then performance gradually started to drop. Things are going very slowly now:
2014-03-31 13:36:37.047423 mon.0 [INF] pgmap v265871: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 44747 kB/s rd, 216 kB/s wr, 10 op/s
2014-03-31 13:36:38.049286 mon.0 [INF] pgmap v265872: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 4069 B/s rd, 363 kB/s wr, 24 op/s
2014-03-31 13:36:39.057680 mon.0 [INF] pgmap v265873: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 5092 B/s rd, 151 kB/s wr, 22 op/s
2014-03-31 13:36:40.075718 mon.0 [INF] pgmap v265874: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 25961 B/s rd, 1527 B/s wr, 10 op/s
2014-03-31 13:36:41.087764 mon.0 [INF] pgmap v265875: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 71574 kB/s rd, 4564 B/s wr, 17 op/s
2014-03-31 13:36:42.109200 mon.0 [INF] pgmap v265876: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 71238 kB/s rd, 3534 B/s wr, 9 op/s
2014-03-31 13:36:43.128113 mon.0 [INF] pgmap v265877: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 4022 B/s rd, 116 kB/s wr, 24 op/s
2014-03-31 13:36:44.143382 mon.0 [INF] pgmap v265878: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 8030 B/s rd, 117 kB/s wr, 29 op/s
2014-03-31 13:36:45.160405 mon.0 [INF] pgmap v265879: 1300 pgs: 1300 active+clean; 19872 GB data, 59953 GB used, 74117 GB / 130 TB avail; 7049 B/s rd, 4531 B/s wr, 9 op/s
ceph-mds seems very busy, and so does one single osd:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
54279 root 20 0 8561m 7.5g 4408 S 105.6 23.8 3202:05 ceph-mds
50242 root 20 0 1378m 373m 6452 S 0.7 1.2 523:38.77 ceph-osd
49446 root 18 -2 10644 356 320 S 0.0 0.0 0:00.00 udevd
49444 root 18 -2 10644 428 320 S 0.0 0.0 0:00.00 udevd
49319 root 20 0 1444m 405m 5684 S 0.0 1.3 513:41.13 ceph-osd
48452 root 20 0 1365m 364m 5636 S 0.0 1.1 551:52.31 ceph-osd
47641 root 20 0 1567m 388m 5880 S 0.0 1.2 754:50.60 ceph-osd
46811 root 20 0 1441m 393m 8256 S 0.0 1.2 603:11.26 ceph-osd
46028 root 20 0 1594m 398m 6156 S 0.0 1.2 657:22.16 ceph-osd
45275 root 20 0 1545m 510m 9920 S 18.9 1.6 943:11.99 ceph-osd
44532 root 20 0 1509m 395m 7380 S 0.0 1.2 665:30.66 ceph-osd
43835 root 20 0 1397m 384m 8292 S 0.0 1.2 466:35.47 ceph-osd
43146 root 20 0 1412m 393m 5884 S 0.0 1.2 506:42.07 ceph-osd
42496 root 20 0 1389m 364m 5292 S 0.0 1.1 522:37.70 ceph-osd
41863 root 20 0 1504m 393m 5864 S 0.0 1.2 462:58.11 ceph-osd
39035 root 20 0 918m 694m 3396 S 3.3 2.2 55:53.59 ceph-mon
Does this look familiar to anyone?
How can we debug this further?
I have already set the mds debug level to 5. There are a lot of 'lookup' entries in the log, but I can't see any warnings or errors reported.
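If it helps, something along these lines should let me raise the logging further and dump the mds counters (mds.<name> is just a placeholder for the actual daemon id):

  # raise mds logging at runtime, no restart needed
  ceph tell mds.<name> injectargs '--debug_mds 10'
  # or, via the admin socket on the mds host
  ceph daemon mds.<name> config set debug_mds 10
  # dump the mds internal counters to see where it is spending its time
  ceph daemon mds.<name> perf dump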
Thanks!
Kind regards,
Kenneth
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com