On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> Hi Sage, first of all thanks for your help.
>
> Please find here https://filesender.belnet.be/?s=download&token=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> the osd log with debug info for osd.49. And indeed, if all the buggy OSDs can restart, that may solve the issue.
> But I am also happy that you confirm my understanding that, in the worst case, removing the pool can also
> resolve the problem, even though in that case I lose data but end up with a working cluster.

If PGs are damaged, removing the pool would be part of getting to HEALTH_OK, but you'd probably also need to remove any problematic PGs that are preventing the OSDs from starting. Keep in mind that (1) I see 3 PGs that don't peer spread across pools 11 and 12, and I'm not sure which one you are considering deleting. Also, (2) if one pool isn't fully available, it generally won't be a problem for other pools, as long as the OSDs start. And doing a ceph-objectstore-tool export-remove is a pretty safe way to move any problem PGs out of the way to get your OSDs starting--just make sure you hold onto that backup/export, because you may need it later!

> PS: I don't know and don't want to open a debate about top/bottom posting, but I would like to know the preference of this list :-)

No preference :)

sage
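
(For reference, a minimal sketch of the export-remove workflow mentioned above. The OSD id, PG id, and export path are placeholders; substitute the IDs from your own cluster, and run this with the OSD stopped.)

    # Stop the OSD that cannot start (example id: 49).
    systemctl stop ceph-osd@49

    # Export the problem PG to a file and remove it from the OSD's store.
    # The PG id (11.2a) and the output path here are placeholders.
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49 \
        --pgid 11.2a --op export-remove --file /root/pg-11.2a.export

    # Keep the export file; it can be re-imported later if needed:
    #   ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49 \
    #       --op import --file /root/pg-11.2a.export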