root@ls-node-5-lcl:~# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op remove --debug --force 2> ceph-objectstore-tool-export-remove.txt
marking collection for removal
setting '_remove' omap key
finish_remove_pgs 11.182_head removing 11.182
Remove successful

So now I suppose I restart the OSD and see.

________________________________________
From: Sage Weil <sage@xxxxxxxxxxxx>
Sent: 04 February 2019 07:37
To: Philippe Van Hecke
Cc: ceph-users@xxxxxxxxxxxxxx; Belnet Services
Subject: Re: Luminous cluster in very bad state need some assistance.

On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> Result of ceph pg ls | grep 11.118:
>
> 11.118 9788 0 0 0 0 40817837568 1584 1584 active+clean 2019-02-01 12:48:41.343228 70238'19811673 70493:34596887 [121,24] 121 [121,24] 121 69295'19811665 2019-02-01 12:48:41.343144 66131'19810044 2019-01-30 11:44:36.006505
>
> cp done.
>
> So I can run the ceph-objectstore-tool --op remove command?

yep!

>
> ________________________________________
> From: Sage Weil <sage@xxxxxxxxxxxx>
> Sent: 04 February 2019 07:26
> To: Philippe Van Hecke
> Cc: ceph-users@xxxxxxxxxxxxxx; Belnet Services
> Subject: Re: Luminous cluster in very bad state need some assistance.
>
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage,
> >
> > I tried the following:
> >
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt
> > but this raises an exception.
> >
> > You can find the file ceph-objectstore-tool-export-remove.txt here: https://filesender.belnet.be/?s=download&token=e2b1fdbc-0739-423f-9d97-0bd258843a33
>
> In that case, cp --preserve=all
> /var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then
> use the ceph-objectstore-tool --op remove command. But first confirm that
> 'ceph pg ls' shows the PG as active.
>
> sage
>
> >
> > Kr
> >
> > Philippe.
> >
> > ________________________________________
> > From: Sage Weil <sage@xxxxxxxxxxxx>
> > Sent: 04 February 2019 06:59
> > To: Philippe Van Hecke
> > Cc: ceph-users@xxxxxxxxxxxxxx; Belnet Services
> > Subject: Re: Luminous cluster in very bad state need some assistance.
> >
> > On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > > Hi Sage, first of all thanks for your help.
> > >
> > > Please find here https://filesender.belnet.be/?s=download&token=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> > > the OSD log with debug info for osd.49. And indeed, if all the buggy OSDs can restart, that may solve the issue.
> > > But I am also happy that you confirm my understanding that, in the worst case, removing the pool can also resolve the problem, even though in that case I lose data but end up with a working cluster.
> >
> > If PGs are damaged, removing the pool would be part of getting to
> > HEALTH_OK, but you'd probably also need to remove any problematic PGs that
> > are preventing the OSD starting.
> >
> > But keep in mind that (1) I see 3 PGs that don't peer spread across pools
> > 11 and 12; not sure which one you are considering deleting. Also (2) if
> > one pool isn't fully available it generally won't be a problem for other
> > pools, as long as the OSDs start. And doing ceph-objectstore-tool
> > export-remove is a pretty safe way to move any problem PGs out of the way
> > to get your OSDs starting -- just make sure you hold onto that backup/export,
> > because you may need it later!
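
For reference, a rough sketch of the export-remove workflow Sage describes above. The export path /root/pg-11.182.export, the systemd unit name, and the exact ordering are illustrative assumptions, not commands taken from this thread:

  # With osd.49 stopped, export the problem PG to a file and remove it from the store.
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ \
      --journal /var/lib/ceph/osd/ceph-49/journal \
      --pgid 11.182 --op export-remove --file /root/pg-11.182.export

  # Start the OSD again and check that it comes up and the cluster settles.
  systemctl start ceph-osd@49
  ceph -s
  ceph pg ls | grep 11.182

  # Only if the exported data is needed later: stop the OSD and re-import it
  # (import reads the pg id from the export file, so --pgid is not required).
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ \
      --journal /var/lib/ceph/osd/ceph-49/journal \
      --op import --file /root/pg-11.182.export
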
> >
> > > PS: I don't know and don't want to open a debate about top/bottom posting, but I would like to know the preference of this list :-)
> >
> > No preference :)
> >
> > sage
> >
> >
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com