David Moreau Simard <dmsimard at ...> writes:

> Hi,
>
> Trying to update my continuous integration environment... same deployment method with the following specs:
> - Ubuntu Precise, Kernel 3.2, Emperor (0.72.2) - Yields a successful, healthy cluster.
> - Ubuntu Trusty, Kernel 3.13, Firefly (0.80.5) - I have stuck placement groups.
>
> Here's some relevant bits from the Trusty/Firefly setup before I move on to what I've done/tried:
> http://pastebin.com/eqQTHcxU <-- This was about halfway through PG healing.
>
> So, the setup is three monitors plus two other hosts with 9 OSDs each.
> At the beginning, all my placement groups were stuck unclean.
>
> I tried the easy things first:
> - set crush tunables to optimal
> - run repairs/scrubs on the OSDs
> - restart the OSDs
>
> Nothing happened. All ~12000 PGs remained stuck unclean since forever, in active+remapped.
> Next, I played with the crush map. I deleted the default replicated_ruleset rule and created a (basic) rule
> for each pool for the time being.
> I set the pools to use their respective rule and also reduced their size to 2 and min_size to 1.
>
> Still nothing, all PGs stuck.
> I'm not sure why, but I tried setting the crush tunables to legacy - I guess in a trial and error attempt.
>
> Half my PGs healed almost immediately. 6082 PGs remained in active+remapped.
> I tried running scrubs/repairs - they wouldn't heal the other half. I set the tunables back to optimal; still nothing.
>
> I set the tunables to legacy again and most of them healed, with only 1335 left in active+remapped.
>
> The remainder of the PGs healed when I restarted the OSDs.
>
> Does anyone have a clue why this happened?
> It looks like switching back and forth between tunables fixed the stuck PGs?
>
> I can easily reproduce this if anyone wants more info.
>
> Let me know!
> --
> David Moreau Simard

I recently encountered the exact same problem. I have been working on a new
cloud deployment, using Vagrant to simulate the physical hosts. I have 4 hosts,
each acting as both a mon and an OSD for testing purposes.

System details:
Ubuntu Trusty (14.04)
Kernel 3.13
Firefly 0.80.5

On deployment of a new cluster, all of my PGs were stuck (HEALTH_WARN 320 pgs
incomplete; 320 pgs stuck inactive; 320 pgs stuck unclean). I tried a ton of
recommended procedures for getting them working and nothing could get them to
budge.

I ran `ceph osd crush tunables legacy` and all 320 PGs went from stuck to
active. This is definitely repeatable, as I can deploy a new cluster with
vagrant/puppet and this happens every time.

So, thank you for posting this work-around.

Peter
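
For anyone hitting the same symptom, something along these lines should show
the stuck PGs and which tunables profile is currently in effect (just a rough
sketch; pool names and PG counts will of course differ per cluster):

    ceph -s                        # overall cluster status
    ceph health detail             # lists the individual stuck/incomplete PGs
    ceph pg dump_stuck unclean     # only the PGs stuck unclean
    ceph osd crush show-tunables   # the CRUSH tunables currently in effect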
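
The work-around itself is just the tunables toggle, roughly like this (again
only a sketch, not a fix for the underlying issue):

    ceph osd crush tunables legacy   # switch to the legacy CRUSH tunables profile
    ceph -w                          # watch the PG states until they settle
    # In David's case the last PGs only cleared after restarting the OSDs,
    # e.g. with upstart on Trusty (adjust the id / init system as needed):
    sudo restart ceph-osd id=0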