Re: One pg stuck in active+undersized+degraded after OSD down


 



On 11/22/21 08:12, David Tinker wrote:
I set osd.7 as "in", uncordoned the node, scaled the OSD deployment back up, and things are recovering with cluster status HEALTH_OK.
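For reference, on a Rook-managed cluster (which the uncordon/deployment wording suggests) those steps would correspond roughly to commands like the following; the rook-ceph namespace and the rook-ceph-osd-7 deployment name are assumed from Rook's usual naming, not taken from the original message:

ceph osd in osd.7                                                    # mark the OSD back in
kubectl uncordon <node-name>                                         # allow pods to schedule on the node again
kubectl -n rook-ceph scale deployment rook-ceph-osd-7 --replicas=1   # bring the OSD pod back up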

I found this message from the archives: https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg47071.html

"You have a large difference in the capacities of the nodes. This resultsin a different host weight, which in turn might lead to problems withthe crush algorithm. It is not able to get three different hosts for OSDplacement for some of the PGs.

CEPH and crush do not cope well with heterogenous setups. I wouldsuggest to move one of the OSDs from host ceph1 to ceph4 to equalize thehost weight."
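A quick way to see the per-host weight and utilization imbalance the quoted message refers to, alongside the ceph osd tree output below, is the utilization view (a standard command, output omitted here):

ceph osd df tree    # CRUSH hierarchy plus per-OSD/host utilization and PG counts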

My nodes do have very different weights. What I am trying to do is re-install each node in the cluster so they all have the same amount of space for Ceph (much less than before, since we need more space for hostpath stuff).

# ceph osd tree
ID   CLASS  WEIGHT    TYPE NAME                               STATUS  REWEIGHT  PRI-AFF
  -1         13.77573  root default
  -5         13.77573      region FSN1
-22          0.73419          zone FSN1-DC13
-21                0              host node5-redacted-com
-27          0.73419              host node7-redacted-com
   1    ssd   0.36710                  osd.1                       up   1.00000  1.00000
   5    ssd   0.36710                  osd.5                       up   1.00000  1.00000
-10          6.20297          zone FSN1-DC14
  -9          6.20297              host node3-redacted-com
   2    ssd   3.10149                  osd.2                       up   1.00000  1.00000
   4    ssd   3.10149                  osd.4                       up   1.00000  1.00000
-18          3.19919          zone FSN1-DC15
-17          3.19919              host node4-redacted-com
   7    ssd   3.19919                  osd.7                     down         0  1.00000
  -4          2.90518          zone FSN1-DC16
  -3          2.90518              host node1-redacted-com
   0    ssd   1.45259                  osd.0                       up   1.00000  1.00000
   3    ssd   1.45259                  osd.3                       up   1.00000  1.00000
-14          0.73419          zone FSN1-DC18
-13                0              host node2-redacted-com
-25          0.73419              host node6-redacted-com
  10    ssd   0.36710                  osd.10                      up   1.00000  1.00000
  11    ssd   0.36710                  osd.11                      up   1.00000  1.00000


Should I just change the weights before/after removing OSD 7?

With something like "ceph osd crush reweight osd.7 1.0"?
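For reference, a reweight-and-verify sequence along those lines would look roughly like this (osd.7 and the 1.0 weight are just the values from the question above):

ceph osd crush reweight osd.7 1.0   # adjust the CRUSH weight of osd.7
ceph osd tree                       # confirm the new host and OSD weights
ceph pg ls undersized               # check whether any PGs are still undersized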

The ceph balancer is there to balance PGs across all nodes. Do you have it enabled?

ceph balancer status

The most efficient way is to use mode upmap (should work with modern clients):

ceph balancer mode upmap
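A typical sequence to switch to upmap mode, assuming every client is at least Luminous (upmap requires it), would be roughly:

ceph osd set-require-min-compat-client luminous   # upmap needs Luminous or newer clients
ceph balancer mode upmap
ceph balancer on
ceph balancer status                              # verify the active mode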

Gr. Stefan
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


