Re: pgs stuck unclean

If the PG cannot be queried, I would bet on the OSD message throttler. Run "ceph --admin-daemon PATH_TO_ADMIN_SOCK perf dump" on each OSD holding this PG and check whether the message throttler's current value has reached its max. If it has, increase the max value in ceph.conf and restart the OSD.
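
As a rough sketch (the socket path, OSD id and the value below are only placeholders, and ms_dispatch_throttle_bytes is just one common candidate - raise whichever option corresponds to the throttle-* counter that is actually full):

  # on each host with an OSD from the acting set of that PG:
  ceph --admin-daemon /var/run/ceph/ceph-osd.<id>.asok perf dump | python -m json.tool | less

  # look at the "throttle-*" sections: a throttler is saturated when its
  # "val" has reached "max" (and "get_or_fail_fail" keeps growing)

  # then raise the limit in ceph.conf, e.g.:
  [osd]
      ms dispatch throttle bytes = 209715200

  # and restart the affected OSD, e.g. on systemd hosts:
  systemctl restart ceph-osd@<id>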

--
Tomasz Kuzemko
tomasz.kuzemko@xxxxxxxxxxxx

On 17.02.2017 at 01:59, Matyas Koszik <koszik@xxxxxx> wrote:

> 
> Hi,
> 
> It seems that my ceph cluster is in an erroneous state that I currently
> cannot see how to get out of.
> 
> The status is the following:
> 
> health HEALTH_WARN
>       25 pgs degraded
>       1 pgs stale
>       26 pgs stuck unclean
>       25 pgs undersized
>       recovery 23578/9450442 objects degraded (0.249%)
>       recovery 45/9450442 objects misplaced (0.000%)
>       crush map has legacy tunables (require bobtail, min is firefly)
> monmap e17: 3 mons at x
>       election epoch 8550, quorum 0,1,2 store1,store3,store2
> osdmap e66602: 68 osds: 68 up, 68 in; 1 remapped pgs
>       flags require_jewel_osds
> pgmap v31433805: 4388 pgs, 8 pools, 18329 GB data, 4614 kobjects
>       36750 GB used, 61947 GB / 98697 GB avail
>       23578/9450442 objects degraded (0.249%)
>       45/9450442 objects misplaced (0.000%)
>           4362 active+clean
>             24 active+undersized+degraded
>              1 stale+active+undersized+degraded+remapped
>              1 active+remapped
> 
> 
> I tried restarting all OSDs, to no avail; it actually made things a bit
> worse.
> From a user's point of view the cluster works perfectly (apart from that
> stale pg, which fortunately hit the pool on which I keep only swap
> images).
> 
> A little background: I made the mistake of creating the cluster with
> size=2 pools, which I'm now in the process of rectifying, but that
> requires some fiddling around. I also tried moving to more optimal
> tunables (firefly), but the documentation is a bit optimistic
> with the 'up to 10%' data movement - it was over 50% in my case, so I
> reverted to bobtail immediately after I saw that number. I then started
> reweighting the osds in anticipation of the size=3 bump, and I think that's
> when this bug hit me.
> 
> Right now I have a pg (6.245) that cannot even be queried - the command
> times out, or gives this output: https://atw.hu/~koszik/ceph/pg6.245
> 
> I queried a few other pgs that are acting up, but cannot see anything
> suspicious, other than the fact they do not have a working peer:
> https://atw.hu/~koszik/ceph/pg4.2ca
> https://atw.hu/~koszik/ceph/pg4.2e4
> 
> Health details can be found here: https://atw.hu/~koszik/ceph/health
> OSD tree: https://atw.hu/~koszik/ceph/tree (here the weight sum of
> ssd/store3_ssd seems to be off, but that has been the case for quite some
> time - not sure if it's related to any of this)
> 
> 
> I tried setting debugging to 20/20 on some of the affected osds, but there
> was nothing there that gave me any ideas on solving this. How should I
> continue debugging this issue?
> 
> BTW, I'm running 10.2.5 on all of my osd/mon nodes.
> 
> Thanks,
> Matyas
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



