Re: stuck with active+undersized+degraded on Jewel after cluster maintenance

Pawel S <pejotes@xxxxxxxxx> · Fri, 3 Aug 2018 15:27:33 +0200

On Fri, Aug 3, 2018 at 2:07 PM Paweł Sadowsk <ceph@xxxxxxxxx> wrote:
On 08/03/2018 01:45 PM, Pawel S wrote:

> Hello!
>
> We did maintenance work (cluster shrinking) on one cluster (Jewel),
> and after shutting down one of the OSDs we ended up in a situation where
> recovery of a PG can't start because it is stuck "querying" one of its
> peers. We restarted this OSD and tried marking it out and in. Nothing
> helped, so finally we moved the data off it (the PG was still on it) and
> removed this OSD from CRUSH and from the whole cluster. But recovery can't
> start on any other OSD to create this copy again. We still have 2 valid
> active copies, but we would like to have the PG clean.
> How can we push recovery to place this third copy somewhere?
> Replication size is 3 on hosts and there are plenty of them.
>
> Status now:
>    health HEALTH_WARN
>             1 pgs degraded
>             1 pgs stuck degraded
>             1 pgs stuck unclean
>             1 pgs stuck undersized
>             1 pgs undersized
>             recovery 268/19265130 objects degraded (0.001%)
>
> Link to PG query details, health status and version commit here:
> https://gist.github.com/pejotes/aea71ecd2718dbb3ceab0e648924d06b

Can you add 'ceph osd tree', 'ceph osd crush show-tunables' and 'ceph
osd crush rule dump'? It looks like CRUSH is not able to find a place for
the 3rd copy due to a big difference in the weights of your racks/hosts,
depending on your CRUSH rules.

Yes, you were right :-)

I quickly went through the algorithm and found that it simply doesn't have enough tries to handle this weight difference (I had 54, 115, 145) across my failure domains. As a workaround, increasing "choose_total_tries" to 100 did the trick. The rules were set to choose across datacenter buckets built from racks and hosts. The next step will be to balance the weights of the datacenter buckets to even them out a bit; a couple of OSDs can be removed. :-)
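
For anyone hitting the same thing: the message above does not say how the tunable was changed, so what follows is only a rough sketch of one possible way, assuming the usual round trip of dumping the CRUSH map, decompiling it to text, editing the "tunable" line and loading it back. The helper name set_crush_tunable and the SAMPLE text are made up for illustration and are not taken from the cluster in this thread.

```python
import re

# Assumed surrounding workflow (run by hand, not by this script):
#   ceph osd getcrushmap -o crush.bin
#   crushtool -d crush.bin -o crush.txt    # decompile to text
#   ... edit crush.txt, e.g. with the helper below ...
#   crushtool -c crush.txt -o crush.new    # recompile
#   ceph osd setcrushmap -i crush.new


def set_crush_tunable(crush_text: str, name: str, value: int) -> str:
    """Return crush_text with the line `tunable <name> <n>` set to value.

    Hypothetical helper: it only rewrites an existing tunable line and
    raises ValueError if the decompiled map has no such line.
    """
    pattern = re.compile(
        rf"^tunable[ \t]+{re.escape(name)}[ \t]+\d+[ \t]*$", re.MULTILINE
    )
    if not pattern.search(crush_text):
        raise ValueError(f"no 'tunable {name}' line found")
    return pattern.sub(f"tunable {name} {value}", crush_text)


# Illustrative fragment of a decompiled CRUSH map (made-up values).
SAMPLE = """\
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
"""

updated = set_crush_tunable(SAMPLE, "choose_total_tries", 100)
print(updated)
```

Loading a changed CRUSH map can trigger data movement, so it is probably worth trying this on a copy of the map first and comparing the result before injecting it.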
Thank you Pawel!

best regards!
Pawel 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com