new feature: auto removal of osds causing "stuck inactive"

Hi all,
I recently encountered a situation where some partially removed OSDs caused my cluster to enter a "stuck inactive" state. The eventual solution was to tell Ceph the OSDs were "lost". Because all of the affected PGs were replicated elsewhere on the cluster, no data was lost.
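For reference, the manual recovery looked roughly like this (osd.12 is just a placeholder id, and the exact invocations are from memory, so check them against your release before running anything):

    # see which PGs are stuck and which OSDs they are waiting on
    ceph health detail
    ceph pg dump_stuck inactive

    # the half-removed OSD was already down and out, so marking it lost
    # let the affected PGs peer again from the surviving replicas
    ceph osd lost 12 --yes-i-really-mean-it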

Would it make sense, or be possible, for Ceph to automatically detect this situation ("stuck inactive" PGs whose data is replicated elsewhere) and take action to un-stick the cluster? E.g. automatically mark the OSD as lost, or mark it down and out to the same effect? A rough sketch of the kind of check I have in mind is below.
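Purely as an illustration, an operator could script a crude version of the check today. This is only a sketch from memory: it just reports what the stuck PGs say is blocking them and leaves the actual "osd lost" decision to a human, and the output format it greps through may differ between releases.

    #!/bin/sh
    # Report PGs stuck inactive and whatever their query output says is
    # blocking peering, so an operator can judge whether the blocking OSD
    # is truly gone before marking it lost by hand.
    for pg in $(ceph pg dump_stuck inactive 2>/dev/null | awk 'NR>1 {print $1}'); do
        echo "== $pg =="
        ceph pg "$pg" query | grep -i blocked
    done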

  Ideally anything that can be safely automated should be.  :)

Thanks!
C.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
