Re: how to recover from: 1 pgs down; 10 pgs incomplete; 10 pgs stuck inactive; 10 pgs stuck unclean

On 13/07/15 15:40, Jelle de Jong wrote:
> I was testing a ceph cluster with osd_pool_default_size = 2. While I
> was rebuilding the OSD on one ceph node, a disk in another node started
> getting read errors and ceph kept taking that OSD down. Instead of
> executing ceph osd set nodown while the other node was rebuilding, I
> kept restarting the OSD; ceph would take it in for a few minutes and
> then take it back down again.
> 
> I then removed the bad OSD from the cluster and later added it back in
> with the nodown flag set and a weight of zero, moving all the data
> away. Then I removed the OSD again and added a new OSD with a new hard
> drive.
> 
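
For reference, the second round of draining and removing the bad OSD
went roughly like this. This is from memory, so the exact commands may
be slightly off; <id> stands for the osd number:

    ceph osd set nodown                    # stop the flapping osd from being marked down
    ceph osd crush reweight osd.<id> 0     # crush weight zero, so data moves off it
    ceph osd out <id>
    ceph osd crush remove osd.<id>
    ceph auth del osd.<id>
    ceph osd rm <id>
    # then created the new osd on the replacement disk (ceph-disk prepare/activate)
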
> However, I ended up with the following cluster status and I can't seem
> to figure out how to get the cluster healthy again. I'm doing this as a
> test before taking this ceph configuration into production.
> 
> http://paste.debian.net/plain/281922
> 
> If I lost data, that's my own fault, but how can I figure out in which
> pool the data was lost and in which rbd volume (and so which kvm guest
> lost data)?
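
My guess is that narrowing this down would go roughly like the
following, but I have not verified that these are the right steps
(<poolname> is a placeholder):

    ceph health detail             # list the down/incomplete pgs
    ceph pg dump_stuck inactive    # show the stuck pgs and the osds they map to
    ceph osd lspools               # the number before the dot in a pg id
                                   # (e.g. 3.1f) is the pool id
    rbd -p <poolname> ls           # list the rbd images in that pool

Is that roughly the right way to find out which rbd image, and so which
kvm guest, is affected?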

Can anybody help?

Can I somehow reweight some OSDs to resolve the problems, or should I
rebuild the whole cluster and lose all the data?
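
What I had in mind was something along these lines, but I do not know
whether changing weights can actually bring incomplete pgs back (<id>
and <weight> are placeholders):

    ceph osd tree                              # check the current weights
    ceph osd crush reweight osd.<id> <weight>  # permanently move data between osds
    ceph osd reweight <id> <0.0-1.0>           # or a temporary override weight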

Kind regards,

Jelle de Jong