PGs go to down state when OSD fails

Hi,

I am trying to understand what happens when an OSD fails.

A few days back I wanted to check what happens when an OSD goes down, so I went to the node and simply stopped one of the OSD services. When the OSD went into the down state, PGs started recovering, and after some time everything seemed fine: everything had recovered and the OSD was marked down and out. I thought, great, I don't really have to worry about data loss when an OSD goes down.
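
For reference, the test was roughly the following (osd.3 is just an example ID):

systemctl stop ceph-osd@3    # stop one OSD service on its node
ceph -s                      # watch the PGs recover
ceph osd tree                # confirm the OSD is reported down and eventually out
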
But recently an OSD went down on its own, and this time the PGs were not able to recover: they went into the down state and everything was stuck, so I had to run this command:
ceph osd lost <osd_number>
which is not really safe, and I might lose data by doing it.
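
For completeness, the full form of that command, as far as I understand it, is (the OSD ID is just a placeholder):

ceph osd lost <osd_number> --yes-i-really-mean-it   # tells the cluster this OSD's data is gone for good

and before running it the stuck PGs can be inspected with:

ceph health detail      # lists the PGs that are down or stuck
ceph pg <pgid> query    # shows which OSDs the PG is waiting for (peering_blocked_by)
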
I am not able to understand why this did not happen when I stopped the service manually the first time, but did happen now. With a replication factor of 2, all of an OSD's data is replicated to other OSDs, so ideally the cluster should have returned to a normal state on its own.
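
In case it matters, the pool replication settings can be checked with something like this (the pool name is just a placeholder):

ceph osd pool get <pool_name> size       # number of replicas kept
ceph osd pool get <pool_name> min_size   # replicas needed to keep serving I/O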

Can someone please explain what I am missing here?

Thanks
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
