On 5/7/14 15:33, Dimitri Maziuk wrote:
> On 05/07/2014 04:11 PM, Craig Lewis wrote:
>> On 5/7/14 13:40, Sergey Malinin wrote:
>>> Check dmesg and SMART data on both nodes. This behaviour is similar to
>>> failing hdd.
>>>
>> It does sound like a failing disk... but there's nothing in dmesg, and
>> smartmontools hasn't emailed me about a failing disk. The same thing is
>> happening to more than 50% of my OSDs, in both nodes.
>
> check 'iostat -dmx 5 5' (or some other numbers) -- if you see 100%+ disk
> utilization, that could be the dying one.

A new OSD, osd.10, has started doing this. I currently have all of the previously advised settings active (osd_max_backfills = 1, osd_recovery_op_priority = 1, osd_recovery_max_active = 1). I stopped the daemon and started watching iostat:

root@ceph1c:~# iostat sde -dmx 5
Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
# I started the osd daemon during this next sample
sde         0.00     0.00     7.60    33.20     0.81     0.92    86.55     0.06     1.57     3.58     1.11     0.71     2.88
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sde         0.00     0.00     2.20   336.00     0.01     1.46     8.91     0.07     0.21    17.82     0.09     0.20     6.88
# During this next sample, the ceph-osd daemon started consuming exactly 100% CPU
sde         0.00     0.00     0.40     8.40     0.00     0.36    84.18     0.02     2.00    26.00     0.86     1.18     1.04
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00

Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sde         0.00     0.00     0.00    18.00     0.00     0.28    31.73     0.02     1.11     0.00     1.11     0.04     0.08
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
<snip repetitive rows>
sde         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sde         0.00     0.00     1.20     0.00     0.08     0.00   132.00     0.02    20.67    20.67     0.00    20.67     2.48
sde         0.00     0.00     0.40     0.00     0.03     0.00   128.00     0.02    46.00    46.00     0.00    46.00     1.84
sde         0.00     0.00     0.20     0.00     0.01     0.00   128.00     0.01    44.00    44.00     0.00    44.00     0.88
sde         0.00     0.00     5.00    15.60     0.41     0.82   121.94     0.03     1.24     4.64     0.15     1.17     2.40
sde         0.00     0.00     0.00    27.40     0.00     0.17    12.44     0.49    17.96     0.00    17.96     0.53     1.44
# The suicide timer hits in this sample or the next, and the daemon restarts
sde         0.00     0.00   113.60   261.20     2.31     1.00    18.08     1.17     3.12    10.15     0.06     1.79    66.96

Device:   rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sde         0.00     0.00   176.20   134.60     3.15     1.31    29.40     1.79     5.77    10.12     0.08     3.16    98.16
sde         0.00     0.00   184.40     6.80     3.05     0.07    33.46     1.94    10.15    10.53     0.00     5.10    97.52
sde         0.00     0.00   202.20    28.80     3.60     0.26    34.26     2.06     8.92    10.18     0.06     4.09    94.40
sde         0.00     0.00   193.20    20.80     2.90     0.28    30.43     2.02     9.44    10.43     0.15     4.58    97.92
^C

During the first cycle, there was almost no data being read or written. During the second cycle, I see what looks like a normal recovery operation, but the daemon still hits 100% CPU and gets kicked out for being unresponsive. The third and fourth cycles (not shown) look like the first cycle.

So this is not a failing disk. 0% disk util with 100% CPU util means the code is stuck in some sort of fast loop that doesn't need external input. It could be some legit task that it's not able to complete before being killed, or it could be a hung lock.
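If it's the former, one cheap test would be raising the OSD's internal suicide timeouts so a long task can finish before the heartbeat code aborts the process. A sketch only -- I'm writing the option names from memory, so they should be verified against a live daemon before trusting them:

# Verify the option names first (admin socket path is the default layout;
# both the path and the names below are assumptions to double-check):
ceph --admin-daemon /var/run/ceph/ceph-osd.10.asok config show | grep suicide

# Then in /etc/ceph/ceph.conf, something like:
#   [osd]
#       osd op thread suicide timeout = 600
#       filestore op thread suicide timeout = 600
# and restart just the one daemon:
service ceph restart osd.10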
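Either way, a few stack samples from the spinning daemon should show what the loop is actually doing. A sketch of that (assumes perf, gdb, and the ceph debug symbols are installed on the node; the pid file path is the sysvinit default and may differ):

# find the daemon's PID (pid file path is an assumption -- or pick it out of ps)
PID=$(cat /var/run/ceph/osd.10.pid)

# live view of the hottest functions in the process
perf top -p $PID

# or take a few whole-process stack snapshots, spaced out
for i in 1 2 3; do
    gdb -p $PID -batch -ex 'thread apply all bt' > /tmp/osd.10.stacks.$i
    sleep 10
done

A thread that shows the same frames in every snapshot is the one eating the CPU.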
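There's also a way to take the peer-driven down-marking out of the picture while I watch, using the standard cluster flags:

ceph osd set noout     # a down OSD won't be marked out, so no re-balancing starts
ceph osd set nodown    # peers can't mark the flapping OSD down

# and to back out of the experiment later:
ceph osd unset nodown
ceph osd unset noout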
I'm going to try setting noout and nodown, and see if that helps. I'm trying to test whether it's some start-up operation (leveldb compaction or something) that can't complete before the other OSDs mark it down. I'll give that an hour to see what happens. If it's still flapping after that, I'll unset nodown and disable the daemon for the time being.

--
*Craig Lewis*
Senior Systems Engineer
Office +1.714.602.1309
Email clewis@centraldesktop.com

*Central Desktop. Work together in ways you never thought possible.*
Connect with us: Website <http://www.centraldesktop.com/> | Twitter <http://www.twitter.com/centraldesktop> | Facebook <http://www.facebook.com/CentralDesktop> | LinkedIn <http://www.linkedin.com/groups?gid=147417> | Blog <http://cdblog.centraldesktop.com/>