Re: inactive PGs looking for a non existent OSD

Hi,

what exactly is your question? You seem to have made progress in bringing OSDs back up and reducing the number of inactive PGs. What is unexpected to me is that a single host failure would cause inactive PGs at all. Can you share more details about your osd tree and the crush rules of the affected inactive PGs? Usually, a properly set up ceph cluster is resilient to a host failure, so after your host failed I would have expected ceph to recover the degraded PGs to other hosts. Did that recovery not happen?
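
Output along these lines would help (standard ceph CLI; substitute one of the affected pg ids):

ceph osd tree
ceph osd pool ls detail
ceph osd crush rule dump
ceph pg <pg.id> query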

Regards,
Eugen

Quoting Alfredo Rezinovsky <alfrenovsky@xxxxxxxxx>:

I had a problem with a server, hardware completely broken.

"ceph orch rm host"  hanged, even with force and offline options

I reinstalled another server with the same IP address and then removed the
OSDs with:

ceph osd purge osd.10
ceph osd purge osd.11
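
(If I recall correctly, the full invocations also need the confirmation flag, roughly:

ceph osd purge 10 --yes-i-really-mean-it
ceph osd purge 11 --yes-i-really-mean-it
)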

Now I have 0.342% pgs not active

with

ceph pg <pg.id> query

I can see the PG is blocked by the non-existent osd.10 (or osd.11 in the
other problematic PG).
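
I pulled that out of the recovery_state section of the query output, roughly like this (jq is just what I had at hand):

ceph pg <pg.id> query | jq '.recovery_state'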

I already tried setting

osd_find_best_info_ignore_history_les = false

on the involved OSDs and restarted them, with some success (I had 3 inactive
PGs, now I have 2).
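
I set it per OSD through the config database and then restarted the daemon, roughly (cephadm assumed; the id is a placeholder):

ceph config set osd.<id> osd_find_best_info_ignore_history_les false
ceph config get osd.<id> osd_find_best_info_ignore_history_les   # verify it took
ceph orch daemon restart osd.<id>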

Also, after that, another OSD kept restarting. I worked around that by setting
its reweight to 0 and am still waiting for the OSD to empty before destroying it.
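
Roughly the drain-and-destroy sequence I am following (the id is a placeholder; safe-to-destroy is just a sanity check before the final step):

ceph osd reweight 12 0
ceph osd df                                  # watch the PGs/usage on osd.12 drop to zero
ceph osd safe-to-destroy 12
ceph osd destroy 12 --yes-i-really-mean-it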


--
Alfrenovsky
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

