I have a 3-node production cluster and everything normally works fine, but I have one failing node. I replaced one disk on Sunday and everything went fine. Last night another disk broke, and Ceph nicely marks it as down. But when I wanted to reboot this node just now, the remaining OSDs on it are all kept in and not marked as down, and the whole cluster locks up for the duration of the reboot. Once the failing node is back, I can reboot either of the other two nodes and it works like a charm. It's only this node that I can no longer reboot without locking the cluster, which I could still do on Sunday...
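
In case it matters, this is roughly the procedure I use around a reboot (just a sketch with the standard maintenance flags; the OSD IDs below are placeholders for the OSDs on that node, and I'm assuming default cluster flags otherwise):

    # check cluster and OSD state before rebooting
    ceph -s
    ceph osd tree

    # prevent CRUSH from marking OSDs out and rebalancing
    # while the node is down
    ceph osd set noout

    # if the node's OSDs are not marked down automatically,
    # mark them down by hand (IDs are placeholders)
    ceph osd down osd.3 osd.5

    # ... reboot the node ...

    # once the node and its OSDs are back up
    ceph osd unset noout

Any idea why the OSDs on this one node are no longer being marked down on reboot, and what I should check?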