Fixed all active+remapped PGs stuck forever (but I have no clue why)

Hello,

On Thu, 14 Aug 2014 03:38:11 +0000 David Moreau Simard wrote:

> Hi,
> 
> Trying to update my continuous integration environment.. same deployment
> method with the following specs:
> - Ubuntu Precise, Kernel 3.2, Emperor (0.72.2) - Yields a successful,
> healthy cluster.
> - Ubuntu Trusty, Kernel 3.13, Firefly (0.80.5) - I have stuck placement
> groups.
> 
> Here's some relevant bits from the Trusty/Firefly setup before I move on
> to what I've done/tried: http://pastebin.com/eqQTHcxU <-- This was about
> halfway through PG healing.
> 
> So, the setup is three monitors, two other hosts on which there are 9
> OSDs each. At the beginning, all my placement groups were stuck unclean.
> 
And there's your reason why the firefly install "failed".
The default replication (pool size) is 3 and you have just 2 storage
nodes. Combined with the default CRUSH rule, which places each replica on
a separate host, the third copy can never be placed, so the PGs stay
stuck unclean - exactly what you saw.
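You can see this in the rule itself: if memory serves, the stock
replicated_ruleset has a "chooseleaf firstn 0 type host" step, which
needs three distinct hosts for a size 3 pool. Roughly (pool name "rbd"
is just the usual default, adjust to whatever you actually have):
---
ceph osd crush rule dump
ceph osd pool get rbd size
---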
To avoid this from the start either use 3 nodes or set
---
osd_pool_default_size = 2
osd_pool_default_min_size = 1
---
in your ceph.conf very early on, before creating anything, especially
OSDs. 
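A quick sanity check afterwards (rough sketch, pool names and counts will
differ on your setup) is to look at what the pools actually ended up
with; any pool still reporting size 3 on a 2-node cluster won't go clean
with the default rule:
---
ceph osd dump | grep size
---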

Setting the replication for all your pools to 2 with "ceph osd pool set
<name> size 2" as the first step after your install should have worked,
too.
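Spelled out, that would be something along these lines for each pool (on
a fresh firefly install the default pools are usually data, metadata and
rbd, substitute your own):
---
ceph osd pool set rbd size 2
ceph osd pool set rbd min_size 1
# repeat for data, metadata and any other pools you created
---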

But with all the things you tried, I can't really tell you why things
behaved the way they did for you.
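For the record, the tunables flipping you describe boils down to
switching between profiles, e.g.:
---
ceph osd crush tunables legacy
ceph osd crush tunables optimal
---
Each profile switch changes the CRUSH mappings and kicks off another
round of peering and data movement, which at least fits with PGs getting
unstuck every time you toggled it.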

Christian

> I tried the easy things first:
> - set crush tunables to optimal
> - run repairs/scrub on OSDs
> - restart OSDs
> 
> Nothing happened. All ~12000 PGs remained stuck unclean since forever
> active+remapped. Next, I played with the crush map. I deleted the
> default replicated_ruleset rule and created a (basic) rule for each pool
> for the time being. I set the pools to use their respective rule and
> also reduced their size to 2 and min_size to 1.
> 
> Still nothing, all PGs stuck.
> I'm not sure why but I tried setting the crush tunables to legacy - I
> guess in a trial and error attempt.
> 
> Half my PGs healed almost immediately. 6082 PGs remained in
> active+remapped. I try running scrubs/repairs - it won't heal the other
> half. I set the tunables back to optimal, still nothing.
> 
> I set tunables to legacy again and most of them end up healing with only
> 1335 left in active+remapped.
> 
> The remainder of the PGs healed when I restarted the OSDs.
> 
> Does anyone have a clue why this happened ?
> It looks like switching back and forth between tunables fixed the stuck
> PGs ?
> 
> I can easily reproduce this if anyone wants more info.
> 
> Let me know !
> --
> David Moreau Simard
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


-- 
Christian Balzer        Network/Systems Engineer                
chibi at gol.com   	Global OnLine Japan/Fusion Communications
http://www.gol.com/

