Re: One pg stuck in active+undersized+degraded after OSD down

Yes it is on:

# ceph balancer status
{
    "active": true,
    "last_optimize_duration": "0:00:00.001867",
    "last_optimize_started": "Mon Nov 22 13:10:24 2021",
    "mode": "upmap",
    "optimize_result": "Unable to find further optimization, or pool(s)
pg_num is decreasing, or distribution is already perfect",
    "plans": []
}
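
Upmap needs every client to speak luminous or newer; that can be
verified with the stock commands:

# ceph osd get-require-min-compat-client
# ceph features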

On Mon, Nov 22, 2021 at 10:17 AM Stefan Kooman <stefan@xxxxxx> wrote:

> On 11/22/21 08:12, David Tinker wrote:
> > I set osd.7 as "in", uncordened the node, scaled the OSD deployment back
> > up and things are recovering with cluster status HEALTH_OK.
> >
> > I found this message from the archives:
> > https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg47071.html
> > <https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg47071.html>
> >
> > "You have a large difference in the capacities of the nodes. This
> > resultsin a different host weight, which in turn might lead to problems
> > withthe crush algorithm. It is not able to get three different hosts for
> > OSDplacement for some of the PGs.
> >
> > CEPH and crush do not cope well with heterogenous setups. I wouldsuggest
> > to move one of the OSDs from host ceph1 to ceph4 to equalize thehost
> > weight."
> >
> > My nodes do have very different weights. What I am trying to do is
> > re-install each node in the cluster so they all have the same amount of
> > space for Ceph (much less than before .. we need more space for hostpath
> > stuff).
> >
> > # ceph osd tree
> > ID   CLASS  WEIGHT    TYPE NAME                          STATUS  REWEIGHT  PRI-AFF
> >  -1         13.77573  root default
> >  -5         13.77573      region FSN1
> > -22          0.73419          zone FSN1-DC13
> > -21                0              host node5-redacted-com
> > -27          0.73419              host node7-redacted-com
> >   1    ssd   0.36710                  osd.1                   up   1.00000  1.00000
> >   5    ssd   0.36710                  osd.5                   up   1.00000  1.00000
> > -10          6.20297          zone FSN1-DC14
> >  -9          6.20297              host node3-redacted-com
> >   2    ssd   3.10149                  osd.2                   up   1.00000  1.00000
> >   4    ssd   3.10149                  osd.4                   up   1.00000  1.00000
> > -18          3.19919          zone FSN1-DC15
> > -17          3.19919              host node4-redacted-com
> >   7    ssd   3.19919                  osd.7                 down         0  1.00000
> >  -4          2.90518          zone FSN1-DC16
> >  -3          2.90518              host node1-redacted-com
> >   0    ssd   1.45259                  osd.0                   up   1.00000  1.00000
> >   3    ssd   1.45259                  osd.3                   up   1.00000  1.00000
> > -14          0.73419          zone FSN1-DC18
> > -13                0              host node2-redacted-com
> > -25          0.73419              host node6-redacted-com
> >  10    ssd   0.36710                  osd.10                  up   1.00000  1.00000
> >  11    ssd   0.36710                  osd.11                  up   1.00000  1.00000
> >
> >
> > Should I just change the weights before/after removing OSD 7?
> >
> > With something like "ceph osd crush reweight osd.7 1.0"?
>
> The ceph balancer is there to balance PGs across all nodes. Do you have
> it enabled?
>
> ceph balancer status
>
> The most efficient way is to use mode upmap (should work with modern
> clients):
>
> ceph balancer mode upmap
>
> Gr. Stefan
>
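
If the weights do need to change by hand, the usual pattern is to lower
the crush weight of osd.7 in steps and let recovery settle in between,
rather than in one big jump (the step values below are just an example):

# ceph osd crush reweight osd.7 2.0
(wait for the cluster to return to HEALTH_OK, then step down again)
# ceph osd crush reweight osd.7 1.0
(repeat until the weight reaches 0, then remove the OSD)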
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


