On 5/21/14 21:15, Sage Weil wrote:
> On Wed, 21 May 2014, Craig Lewis wrote:
>> If you do this over IRC, can you please post a summary to the mailing
>> list?
>>
>> I believe I'm having this issue as well.
> In the other case, we found that some of the OSDs were behind processing
> maps (by several thousand epochs). The trick here to give them a chance
> to catch up is
>
>     ceph osd set noup
>     ceph osd set nodown
>     ceph osd set noout
>
> and wait for them to stop spinning on the CPU. You can check which map
> each OSD is on with
>
>     ceph daemon osd.NNN status
>
> to see which epoch they are on and compare that to
>
>     ceph osd stat
>
> Once they are within 100 or fewer epochs,
>
>     ceph osd unset noup
>
> and let them all start up.
>
> We haven't determined whether the original problem was caused by this or
> the other way around; we'll see once they are all caught up.
>
> sage

I was seeing the CPU spinning too, so I think it's the same issue. Thanks
for the explanation! I've been pulling my hair out for weeks.

I can give you a data point for the "how". My problems started with a
kswapd problem on Ubuntu 12.04.4 (kernel 3.5.0-46-generic
#70~precise1-Ubuntu). kswapd was consuming 100% CPU, and it was blocking
the ceph-osd processes. Once I prevented kswapd from doing that, my OSDs
couldn't recover. noout and nodown didn't help; the OSDs would suicide
and restart.

Upgrading to Ubuntu 14.04 seems to have helped. The cluster isn't all
clear yet, but it's getting better: it's finally healthy after two weeks
of incomplete and stale PGs. It's still unresponsive, but it's making
progress.

I am still seeing OSDs consuming 100% CPU, but only the OSDs that are
actively deep-scrubbing. Once the deep-scrub finishes, the OSD starts
behaving again. They seem to be slowly getting better, which matches up
with your explanation.

I'll go ahead and set noup. I don't think it's necessary at this point,
but it's not going to hurt.
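For anyone else doing this epoch comparison by hand across many OSDs, a rough script can save some typing. This is only a sketch, not a supported tool: it assumes the OSD admin sockets live in /var/run/ceph, that "ceph osd stat" prints the cluster epoch as "eNNNN:", and that the "ceph daemon osd.N status" JSON contains a "newest_map" field. Those details vary by release (Emperor doesn't support the status command, as noted below), so check the raw output on your cluster first.

```shell
#!/bin/sh
# Sketch: report how far each local OSD's map epoch lags the cluster epoch.
# Assumptions (verify on your release): `ceph osd stat` output starts with
# "e<epoch>:", and `ceph daemon osd.N status` JSON contains "newest_map".

# extract the epoch from `ceph osd stat` output, e.g. "e1234: 5 osds: 5 up, 5 in"
epoch_from_stat() { sed -n 's/.*e\([0-9][0-9]*\):.*/\1/p'; }

# extract the "newest_map" value from the OSD's admin-socket status JSON
epoch_from_status() { sed -n 's/.*"newest_map": \([0-9][0-9]*\).*/\1/p'; }

if command -v ceph >/dev/null 2>&1; then
    cluster_epoch=$(ceph osd stat | epoch_from_stat)
    echo "cluster epoch: $cluster_epoch"
    for sock in /var/run/ceph/ceph-osd.*.asok; do
        [ -e "$sock" ] || continue
        # derive the OSD id from the socket name, e.g. ceph-osd.12.asok -> 12
        id=${sock##*ceph-osd.}; id=${id%.asok}
        osd_epoch=$(ceph daemon "osd.$id" status | epoch_from_status)
        echo "osd.$id: epoch $osd_epoch, lag $((cluster_epoch - osd_epoch))"
    done
fi
```

Run it on each OSD host; once the reported lag drops under 100 or so everywhere, it should be safe to unset noup per Sage's instructions.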
I'm running Emperor, and it looks like osd status isn't supported. Not a
big deal, though. Deep-scrub has made it through half of the PGs in the
last 36 hours, so I'll just watch for another day or two. This is a
slave cluster, so I have that luxury.

--
Craig Lewis
Senior Systems Engineer
Office +1.714.602.1309
Email clewis at centraldesktop.com