On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> Hi Sage, first of all thanks for your help.
>
> Please find here https://filesender.belnet.be/?s=download&token=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> the osd log with debug info for osd.49. And indeed, if all the buggy OSDs can restart, that may solve the issue.
> But I am also happy that you confirm my understanding that, in the worst case, removing the pool can also
> resolve the problem, even though in that case I lose data but end up with a working cluster.

If PGs are damaged, removing the pool would be part of getting to HEALTH_OK, but you'd probably also need to remove any problematic PGs that are preventing the OSDs from starting. Keep in mind that (1) I see 3 PGs that don't peer spread across pools 11 and 12, and I'm not sure which one you are considering deleting. Also, (2) if one pool isn't fully available, it generally won't be a problem for other pools, as long as the OSDs start. And doing a ceph-objectstore-tool export-remove is a pretty safe way to move any problem PGs out of the way to get your OSDs starting--just make sure you hold onto that backup/export, because you may need it later!

> PS: I don't know and don't want to open a debate about top/bottom posting, but I would like to know the preference of this list :-)

No preference :)

sage
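
(For reference, a minimal sketch of the export-remove workflow mentioned above. The OSD id, PG id, and export path are placeholders; substitute the IDs from your own cluster, and run this with the OSD stopped.)

    # Stop the OSD that cannot start (example id: 49).
    systemctl stop ceph-osd@49

    # Export the problem PG to a file and remove it from the OSD's store.
    # The PG id (11.2a) and the output path here are placeholders.
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49 \
        --pgid 11.2a --op export-remove --file /root/pg-11.2a.export

    # Keep the export file; it can be re-imported later if needed:
    #   ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49 \
    #       --op import --file /root/pg-11.2a.export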