Re: Luminous cluster in very bad state need some assistance.

So I restarted the OSD, but it stopped again after some time. The restart did have an effect on the cluster, though: the cluster is now in a partial recovery process.

Please find here the log file of osd.49 after this restart:
https://filesender.belnet.be/?s=download&token=8c9c39f2-36f6-43f7-bebb-175679d27a22

Kr

Philippe.

________________________________________
From: Philippe Van Hecke
Sent: 04 February 2019 07:42
To: Sage Weil
Cc: ceph-users@xxxxxxxxxxxxxx; Belnet Services
Subject: Re:  Luminous cluster in very bad state need some assistance.

root@ls-node-5-lcl:~# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op remove --debug --force  2> ceph-objectstore-tool-export-remove.txt
 marking collection for removal
setting '_remove' omap key
finish_remove_pgs 11.182_head removing 11.182
Remove successful

So now I suppose I restart the OSD and see what happens.
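
A minimal sketch of the "restart the OSD and see" step, for illustration only. The systemd unit name `ceph-osd@49`, the helper names `run` and `restart_and_check`, and the choice of status commands are assumptions and do not come from the thread. By default the script only prints the commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Sketch only: unit name and the set of checks are assumptions, not from the thread.
OSD_ID=49
PGID=11.182

# Print the command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

restart_and_check() {
    # Start the OSD whose PG copy was just removed.
    run systemctl start "ceph-osd@${OSD_ID}"
    # Check whether the daemon is still running (it may stop again later).
    run systemctl is-active "ceph-osd@${OSD_ID}"
    # Overall cluster state, including recovery progress.
    run ceph -s
    run ceph health detail
    # Peering and recovery state of the PG that was removed from this OSD.
    run ceph pg "$PGID" query
}

# Usage: restart_and_check          (dry run, prints the commands)
#        RUN=1 restart_and_check    (actually runs them)
```

Since the OSD may stop again some time after starting, repeating the `is-active` check and looking at the OSD log after a few minutes is probably more informative than a single check right after the start.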


________________________________________
From: Sage Weil <sage@xxxxxxxxxxxx>
Sent: 04 February 2019 07:37
To: Philippe Van Hecke
Cc: ceph-users@xxxxxxxxxxxxxx; Belnet Services
Subject: Re:  Luminous cluster in very bad state need some assistance.

On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> Result of ceph pg ls | grep 11.118:
>
> 11.118     9788                  0        0         0       0 40817837568 1584     1584                             active+clean 2019-02-01 12:48:41.343228  70238'19811673  70493:34596887  [121,24]        121  [121,24]            121  69295'19811665 2019-02-01 12:48:41.343144  66131'19810044 2019-01-30 11:44:36.006505
>
> cp done.
>
> So can I now run the ceph-objectstore-tool --op remove command?

yep!
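
A minimal sketch of the two steps confirmed here, before the remove is run: check that `ceph pg ls` reports the PG as active, then copy the PG directory aside. The helper names, the `BACKUP_DIR` location, and the `-r` flag on `cp` (added because the source is a directory) are assumptions and are not from the thread. The check also assumes the PG state shows up as a whitespace-separated field that starts with "active". By default the copy is only printed; set RUN=1 to execute it.

```shell
#!/bin/sh
# Sketch only: helper names and BACKUP_DIR are hypothetical.
OSD_ID=49
PGID=11.182
OSD_PATH="/var/lib/ceph/osd/ceph-${OSD_ID}"
BACKUP_DIR="${BACKUP_DIR:-/root/pg-backup}"   # assumed safe location

# Read `ceph pg ls` output on stdin; succeed if the line for PG $1
# has a field starting with "active" (for example active+clean).
pg_is_active() {
    awk -v pg="$1" '
        $1 == pg { for (i = 2; i <= NF; i++) if ($i ~ /^active/) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Print the command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Copy the PG directory of the (stopped) OSD to the backup location.
backup_pg_dir() {
    run mkdir -p "$BACKUP_DIR"
    run cp -r --preserve=all "${OSD_PATH}/current/${PGID}_head" "$BACKUP_DIR/"
}

# Usage: ceph pg ls | pg_is_active "$PGID" && backup_pg_dir
```

Chaining the two with `&&` means the copy is only attempted when the PG is reported active, which matches the order of the advice: confirm first, then copy, then remove.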


>
> ________________________________________
> From: Sage Weil <sage@xxxxxxxxxxxx>
> Sent: 04 February 2019 07:26
> To: Philippe Van Hecke
> Cc: ceph-users@xxxxxxxxxxxxxx; Belnet Services
> Subject: Re:  Luminous cluster in very bad state need some assistance.
>
> On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > Hi Sage,
> >
> > I tried the following:
> >
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-49/ --journal /var/lib/ceph/osd/ceph-49/journal --pgid 11.182 --op export-remove --debug --file /tmp/export-pg/18.182 2>ceph-objectstore-tool-export-remove.txt
> > but this raises an exception.
> >
> > Find the file ceph-objectstore-tool-export-remove.txt here: https://filesender.belnet.be/?s=download&token=e2b1fdbc-0739-423f-9d97-0bd258843a33
>
> In that case,  cp --preserve=all
> /var/lib/ceph/osd/ceph-49/current/11.182_head to a safe location and then
> use the ceph-objectstore-tool --op remove command.  But first confirm that
> 'ceph pg ls' shows the PG as active.
>
> sage
>
>
> > Kr
> >
> > Philippe.
> >
> > ________________________________________
> > From: Sage Weil <sage@xxxxxxxxxxxx>
> > Sent: 04 February 2019 06:59
> > To: Philippe Van Hecke
> > Cc: ceph-users@xxxxxxxxxxxxxx; Belnet Services
> > Subject: Re:  Luminous cluster in very bad state need some assistance.
> >
> > On Mon, 4 Feb 2019, Philippe Van Hecke wrote:
> > > Hi Sage, first of all thanks for your help.
> > >
> > > Please find here  https://filesender.belnet.be/?s=download&token=dea0edda-5b6a-4284-9ea1-c1fdf88b65e9
> > > the osd log with debug info for osd.49. And indeed, if all the buggy OSDs can restart, that may solve the issue.
> > > But I am also happy that you confirm my understanding that, in the worst case, removing the pool can also resolve the problem: in that case I lose data but end up with a working cluster.
> >
> > If PGs are damaged, removing the pool would be part of getting to
> > HEALTH_OK, but you'd probably also need to remove any problematic PGs that
> > are preventing the OSD starting.
> >
> > But keep in mind that (1) I see 3 PGs that don't peer spread across pools
> > 11 and 12; not sure which one you are considering deleting.  Also (2) if
> > one pool isn't fully available it generally won't be a problem for other
> > pools, as long as the osds start.  And doing ceph-objectstore-tool
> > export-remove is a pretty safe way to move any problem PGs out of the way
> > to get your OSDs starting--just make sure you hold onto that backup/export
> > because you may need it later!
> >
> > > PS: I don't want to open a debate about top/bottom posting, but I would like to know the preference of this list :-)
> >
> > No preference :)
> >
> > sage
> >
> >
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


