First, I would restart the active mgr; the reported status might simply be
outdated, and I've seen that many times. If the PG is still in the remapped
state after that, you'll need to provide a lot more information about your
cluster: the current osd tree, ceph status, the applied CRUSH rule, etc.
One possible root cause is that CRUSH gives up too soon; that can be
adjusted in the rule by increasing set_choose_tries.
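As a rough sketch, the usual CLI workflow looks like this. Note the rule id and the set_choose_tries value below are placeholders you'd adapt to your cluster, and these commands obviously need a live cluster with admin keys:

```shell
# Fail over the active mgr so a standby takes over and reports fresh state.
ceph mgr fail

# If the PG is still remapped afterwards, collect the details worth sharing:
ceph status
ceph osd tree
ceph pg dump pgs_brief | grep -E 'remapped|backfill'
ceph osd crush rule dump

# To raise set_choose_tries, edit the decompiled CRUSH map. The value 100
# is just a commonly suggested starting point, not a tuned number:
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# In crushmap.txt, inside the relevant rule, add or raise:
#   set_choose_tries 100
crushtool -c crushmap.txt -o crushmap.new

# Optionally sanity-check the mappings before injecting the new map
# (--num-rep 7 matches a k=5 m=2 EC pool; <rule-id> is from the dump above):
crushtool -i crushmap.new --test --show-bad-mappings --rule <rule-id> --num-rep 7

ceph osd setcrushmap -i crushmap.new
```

The crushtool test step is cheap insurance: if the rule still can't find 7 OSDs for some inputs, you'll see bad mappings there instead of on the live cluster.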
Quoting Jorge Garcia <jgarcia@xxxxxxxxxxxx>:
We were having an OSD reporting lots of errors, so I tried to remove
it by doing:
ceph orch osd rm 139 --zap
It started moving all the data. Eventually, we got to the point that
there's only 1 pg backfilling, but that seems to be stuck now. I think
it may be because, in the process, another OSD (103) started reporting
errors, too. The pool is erasure k:5 m:2, so it should still be OK. I
don't see any progress happening on the backfill, and ceph -s has been
reporting "69797/925734949 objects misplaced (0.008%)" for days now.
How can I get it to finish the backfilling, or at least find out why
it's not working?
ceph version: Quincy
Thanks!
Jorge
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx