Re: osd daemons stuck in D state

The server has 128 GB RAM (it also runs KVM virtual machines, which use ~95 GB).
The HBA is an LSI Logic SAS1068 PCI-X Fusion-MPT SAS (kernel module mptsas, version 3.04.20).
I have two HBAs, but I didn't want to use multipath, so there is only one path per LUN (the array's controllers cannot run active/active anyway).
The value of vm.min_free_kbytes is set by Puppet to 2 percent of the installed RAM (reserved for the host's kernel).
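
For reference, the 2 percent figure works out roughly like this (a simplified sketch of the policy, not the actual Puppet code):

   # 2% of MemTotal (in kB), written the way sysctl.conf expects it
   awk '/MemTotal/ { printf "vm.min_free_kbytes = %d\n", $2 * 2 / 100 }' /proc/meminfo

On this 128 GB box that gives roughly 2.6 million kB, which matches the 2640322 in the sysctl.conf dump further down.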

I found something unusual in the osd's logfile:

   -95> 2015-07-27 09:53:17.183989 7fd62219a700 10 monclient: _send_mon_message to mon.pshn05 at 10.71.13.5:6789/0
   -94> 2015-07-27 09:53:17.184000 7fd62219a700  1 -- 10.71.13.6:6801/1150636 --> 10.71.13.5:6789/0 -- log(6 entries) v1 -- ?+0 0x19855440 con 0x87002c0
   -93> 2015-07-27 09:53:18.056919 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -92> 2015-07-27 09:53:20.938324 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,2,4,12,18])
   -91> 2015-07-27 09:53:23.057191 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -90> 2015-07-27 09:53:26.238577 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,1,4,13,18])
   -89> 2015-07-27 09:53:27.184101 7fd62219a700 10 monclient: tick
   -88> 2015-07-27 09:53:27.184134 7fd62219a700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2015-07-27 09:52:57.184132)
   -87> 2015-07-27 09:53:27.184153 7fd62219a700 10 monclient: renew subs? (now: 2015-07-27 09:53:27.184152; renew after: 2015-07-27 09:55:37.183595) -- no
   -86> 2015-07-27 09:53:28.057309 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -85> 2015-07-27 09:53:29.738763 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1,4,13,18])
   -84> 2015-07-27 09:53:32.639022 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1,4,13,18])
   -83> 2015-07-27 09:53:33.057450 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -82> 2015-07-27 09:53:37.184306 7fd62219a700 10 monclient: tick
   -81> 2015-07-27 09:53:37.184337 7fd62219a700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2015-07-27 09:53:07.184335)
   -80> 2015-07-27 09:53:37.184352 7fd62219a700 10 monclient: renew subs? (now: 2015-07-27 09:53:37.184352; renew after: 2015-07-27 09:55:37.183595) -- no
   -79> 2015-07-27 09:53:37.339282 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,4,13,18])
   -78> 2015-07-27 09:53:38.057573 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -77> 2015-07-27 09:53:42.639514 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,5,13,18])
   -76> 2015-07-27 09:53:43.057726 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -75> 2015-07-27 09:53:46.739747 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,5,13,18])
   -74> 2015-07-27 09:53:47.184469 7fd62219a700 10 monclient: tick
   -73> 2015-07-27 09:53:47.184498 7fd62219a700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2015-07-27 09:53:17.184497)
   -72> 2015-07-27 09:53:47.184516 7fd62219a700 10 monclient: renew subs? (now: 2015-07-27 09:53:47.184515; renew after: 2015-07-27 09:55:37.183595) -- no
   -71> 2015-07-27 09:53:47.240011 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,5,13,18])
   -70> 2015-07-27 09:53:47.740229 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,5,13,18])
   -69> 2015-07-27 09:53:48.057842 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -68> 2015-07-27 09:53:51.840439 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,5,13,18])
   -67> 2015-07-27 09:53:53.057968 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -66> 2015-07-27 09:53:54.740614 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,5,13,18])
   -65> 2015-07-27 09:53:57.040854 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,5,13,18])
   -64> 2015-07-27 09:53:57.184668 7fd62219a700 10 monclient: tick
   -63> 2015-07-27 09:53:57.184691 7fd62219a700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2015-07-27 09:53:27.184689)
   -62> 2015-07-27 09:53:57.184701 7fd62219a700 10 monclient: renew subs? (now: 2015-07-27 09:53:57.184701; renew after: 2015-07-27 09:55:37.183595) -- no
   -61> 2015-07-27 09:53:58.058079 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -60> 2015-07-27 09:54:01.741090 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,5,12,19])
   -59> 2015-07-27 09:54:03.058197 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -58> 2015-07-27 09:54:05.241345 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,5,12,19])
   -57> 2015-07-27 09:54:07.184789 7fd62219a700 10 monclient: tick
   -56> 2015-07-27 09:54:07.184819 7fd62219a700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2015-07-27 09:53:37.184816)
   -55> 2015-07-27 09:54:07.184833 7fd62219a700 10 monclient: renew subs? (now: 2015-07-27 09:54:07.184833; renew after: 2015-07-27 09:55:37.183595) -- no
   -54> 2015-07-27 09:54:08.058341 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -53> 2015-07-27 09:54:10.541580 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,5,12,19])
   -52> 2015-07-27 09:54:11.641813 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,5,12,19])
   -51> 2015-07-27 09:54:12.142027 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,5,12,19])
   -50> 2015-07-27 09:54:13.058501 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -49> 2015-07-27 09:54:17.184956 7fd62219a700 10 monclient: tick
   -48> 2015-07-27 09:54:17.184998 7fd62219a700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2015-07-27 09:53:47.184996)
   -47> 2015-07-27 09:54:17.185012 7fd62219a700 10 monclient: renew subs? (now: 2015-07-27 09:54:17.185012; renew after: 2015-07-27 09:55:37.183595) -- no
   -46> 2015-07-27 09:54:18.042246 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,5,12,19])
   -45> 2015-07-27 09:54:18.058651 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -44> 2015-07-27 09:54:20.342491 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,5,12,19])
   -43> 2015-07-27 09:54:20.842714 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,4,13,19])
   -42> 2015-07-27 09:54:22.542947 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,4,13,19])
   -41> 2015-07-27 09:54:23.058771 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -40> 2015-07-27 09:54:26.643171 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,4,13,19])
   -39> 2015-07-27 09:54:27.185160 7fd62219a700 10 monclient: tick
   -38> 2015-07-27 09:54:27.185192 7fd62219a700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2015-07-27 09:53:57.185190)
   -37> 2015-07-27 09:54:27.185207 7fd62219a700 10 monclient: renew subs? (now: 2015-07-27 09:54:27.185206; renew after: 2015-07-27 09:55:37.183595) -- no
   -36> 2015-07-27 09:54:28.058903 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -35> 2015-07-27 09:54:29.543389 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,4,12,20])
   -34> 2015-07-27 09:54:31.843630 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,4,12,20])
   -33> 2015-07-27 09:54:33.059016 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -32> 2015-07-27 09:54:37.143863 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,3,13,20])
   -31> 2015-07-27 09:54:37.185293 7fd62219a700 10 monclient: tick
   -30> 2015-07-27 09:54:37.185320 7fd62219a700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2015-07-27 09:54:07.185318)
   -29> 2015-07-27 09:54:37.185335 7fd62219a700 10 monclient: renew subs? (now: 2015-07-27 09:54:37.185334; renew after: 2015-07-27 09:55:37.183595) -- no
   -28> 2015-07-27 09:54:38.059155 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -27> 2015-07-27 09:54:41.244092 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,4,13,20])
   -26> 2015-07-27 09:54:43.059269 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -25> 2015-07-27 09:54:45.944307 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,3,14,20])
   -24> 2015-07-27 09:54:46.444561 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,3,14,20])
   -23> 2015-07-27 09:54:47.185474 7fd62219a700 10 monclient: tick
   -22> 2015-07-27 09:54:47.185505 7fd62219a700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2015-07-27 09:54:17.185504)
   -21> 2015-07-27 09:54:47.185520 7fd62219a700 10 monclient: renew subs? (now: 2015-07-27 09:54:47.185520; renew after: 2015-07-27 09:55:37.183595) -- no
   -20> 2015-07-27 09:54:48.059419 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -19> 2015-07-27 09:54:51.744775 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4,14,20])
   -18> 2015-07-27 09:54:52.844997 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4,14,20])
   -17> 2015-07-27 09:54:53.059598 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -16> 2015-07-27 09:54:57.185641 7fd62219a700 10 monclient: tick
   -15> 2015-07-27 09:54:57.185671 7fd62219a700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2015-07-27 09:54:27.185669)
   -14> 2015-07-27 09:54:57.185686 7fd62219a700 10 monclient: renew subs? (now: 2015-07-27 09:54:57.185686; renew after: 2015-07-27 09:55:37.183595) -- no
   -13> 2015-07-27 09:54:58.059724 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -12> 2015-07-27 09:54:58.745245 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4,13,21])
   -11> 2015-07-27 09:55:03.059839 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
   -10> 2015-07-27 09:55:04.045479 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,4,13,21])
    -9> 2015-07-27 09:55:05.145700 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,13,21])
    -8> 2015-07-27 09:55:07.185779 7fd62219a700 10 monclient: tick
    -7> 2015-07-27 09:55:07.185812 7fd62219a700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2015-07-27 09:54:37.185810)
    -6> 2015-07-27 09:55:07.185839 7fd62219a700 10 monclient: renew subs? (now: 2015-07-27 09:55:07.185839; renew after: 2015-07-27 09:55:37.183595) -- no
    -5> 2015-07-27 09:55:08.059975 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
    -4> 2015-07-27 09:55:09.245933 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,12,22])
    -3> 2015-07-27 09:55:12.146208 7fd61e993700  5 osd.17 24105 heartbeat: osd_stat(608 GB used, 228 GB avail, 836 GB total, peers []/[] op hist [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,12,22])
    -2> 2015-07-27 09:55:13.060094 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had timed out after 15
    -1> 2015-07-27 09:55:13.060134 7fd634369700  1 heartbeat_map is_healthy 'OSD::op_tp thread 0x7fd620997700' had suicide timed out after 150
     0> 2015-07-27 09:55:13.062432 7fd634369700 -1 common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, const char*, time_t)' thread 7fd634369700 time 2015-07-27 09:55:13.060158
common/HeartbeatMap.cc: 79: FAILED assert(0 == "hit suicide timeout")

 ceph version 0.80.10 (ea6c958c38df1216bf95c927f143d8b13c4a9e70)
 1: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, char const*, long)+0x2a9) [0x9add79]
 2: (ceph::HeartbeatMap::is_healthy()+0xc6) [0x9ae5f6]
 3: (ceph::HeartbeatMap::check_touch_file()+0x17) [0x9aec57]
 4: (CephContextServiceThread::entry()+0x154) [0xa7b064]
 5: (()+0x8182) [0x7fd637aeb182]
 6: (clone()+0x6d) [0x7fd63625e47d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.17.log
--- end dump of recent events ---
2015-07-27 09:55:13.217007 7fd634369700 -1 *** Caught signal (Aborted) **
 in thread 7fd634369700
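
If I read the log right, the 15 and 150 second values above are the op thread timeout and its suicide timeout. As a temporary workaround (it would only hide the symptom, the thread would still be blocked on the array) they could be raised in ceph.conf; the option names below are what I believe firefly uses, so correct me if they differ:

   [osd]
     # default 15 s; the op_tp thread keeps being flagged unhealthy after this
     osd op thread timeout = 60
     # default 150 s; hitting this is what fires the "hit suicide timeout" assert
     osd op thread suicide timeout = 600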



Right now I've removed (again) the stuck osds from the crushmap, but I'll try to re-add them in order to reproduce the issue and then run the commands you suggested.
I also want to test a new osd setup anyway: a single drive configured as a RAID0 LUN (the only option on the MD3000), to see whether the overhead on the MD3000's controller is lower.
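
For completeness, these are roughly the commands I use to pull an osd out and put it back (osd.17 as an example; the 0.82 weight is just the 836 GB size in TB, and the host bucket name is a placeholder):

   # drop the osd from the crushmap so nothing maps to it any more
   ceph osd crush remove osd.17
   # ...kill -9 the daemon, wait, restart it, then weight it back in
   ceph osd crush add osd.17 0.82 host=<osd-host>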

Thanks,

Simion Rad.

From: Jan Schermer [jan@xxxxxxxxxxx]
Sent: Monday, July 27, 2015 15:38
To: Simion Rad
Cc: ceph-users@xxxxxxxx
Subject: Re: osd daemons stuck in D state

When those processes become blocked, are the drives busy or idle?

Can you post the output from
"ps -awexo pid,tt,user,fname,tmout,f,wchan” on those processes when that happens?

My guess would be that they really are waiting for the disk array for some reason - can you check whether you can read/write to the OSD partitions when this happens? iostat output?
I'm not sure what your HBA is, but sometimes very bad things happen when you saturate the drives and cache completely, like the controller lowering the LUN queue depth to 1 (which completely kills IO) or even dropping commands (though that would likely show up in dmesg). This situation is often masked by the driver completely. And Ceph is really good at saturating drives.
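
Something like this (just a sketch, adjust the device names to your setup) should show whether the drives are actually moving data or sitting idle while the daemons are in D state, and where in the kernel they are stuck:

   # per-device utilisation, queue size and await while the problem is happening
   iostat -x 1 /dev/sd?
   # for each ceph-osd process, the kernel function it is sleeping in
   for pid in $(pgrep ceph-osd); do echo "$pid $(cat /proc/$pid/wchan 2>/dev/null)"; done

If wchan points into mptsas or the SCSI layer, that would pretty much confirm the array/HBA theory.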

Also, how much memory does your machine have? vm.min_free_kbytes = 2640322 looks pretty high to me, and it could block anything at any time if kswapd kicks in and starts reclaiming pages. (But the whole system would be unusable, so you'd likely notice that.)

Jan

On 27 Jul 2015, at 13:24, Simion Rad <Simion.Rad@xxxxxxxxx> wrote:

Hello all, 

When I try to add more than one osd to a host and the backfilling process starts, all the osd daemons except one become stuck in D state. When this happens they are shown as out and down (when running ceph osd tree).

The only way I can kill the processes is to remove the osds from the crushmap, then run kill -9 on them and wait for a couple of minutes.
There are no exception messages in the osd logs and dmesg looks ok too (nothing out of the ordinary).
I run ceph firefly 0.80.10 on Ubuntu 14.04 (linux 3.13).
The osds are running on RAID0 LUNs (2 drives per disk group) created on a Dell MD3000 array with Hitachi hard drives (450 GB, 15K RPM).
The issue happens even with 2 or 3 osds active on the host.
I have only a 1 Gb/s link to the host. Could the network bandwidth be the issue?
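
For what it's worth, a 1 Gb/s link tops out at roughly 120 MB/s. A rough way I could check whether backfilling actually hits that ceiling is to watch per-interface throughput while it runs (sar is from the sysstat package):

   # per-interface rx/tx throughput, once per second, while backfilling runs
   sar -n DEV 1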

The settings from sysctl.conf:

net.core.netdev_max_backlog = 250000
net.core.optmem_max = 16777216
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_mem = 16777216 16777216 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216
net.ipv4.tcp_low_latency = 1
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.all.rp_filter = 0
net.ipv4.ip_forward = 1
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_window_scaling = 0
net.ipv4.route.flush=1
vm.min_free_kbytes = 2640322
vm.swappiness = 0
vm.overcommit_memory = 1
vm.oom_kill_allocating_task = 0
vm.dirty_expire_centisecs = 360000
vm.dirty_writeback_centisecs = 360000
kernel.pid_max = 4194303
fs.file-max = 16815744
vm.dirty_ratio = 99
vm.dirty_background_ratio = 99
vm.vfs_cache_pressure = 100

Thanks, 
Simion Rad.




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

