On Tue, Dec 03, 2013 at 02:09:26PM +0800, 飞 wrote:
> Hello, I'm testing Ceph as storage for KVM virtual machine images.
> My cluster has 3 mons and 3 data nodes; every data node has 8x2T SATA
> HDDs and 1 SSD for the journal.
> When I shut down one data node to simulate a server fault, the cluster
> begins to recover. During recovery I can see many blocked requests,
> and the KVM VMs crash (they crash because they think their disk is
> offline).
> How can I solve this issue? Any ideas? Thank you.

Try reducing the number of concurrent recovery operations by lowering
"osd recovery max active" (default 5); it should really help. A value
of 2 works fine here.

We also had a similar problem here where XFS hung due to memory problems
(and the recovery requests hung with it). Upgrading the kernel to 3.11
seems to have fixed this.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
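
For reference, here is a minimal sketch of the two usual ways to apply the
"osd recovery max active" suggestion above: a ceph.conf fragment (read when
an OSD starts) and a "ceph tell ... injectargs" call (changes running OSDs,
not persistent across restarts). The helper names recovery_conf_fragment and
injectargs_command are made up for illustration, and the exact injectargs
syntax may differ between Ceph releases, so check it against your version
before running anything.

```python
import configparser
import io
from typing import List


def recovery_conf_fragment(max_active: int = 2) -> str:
    """Render an [osd] section for ceph.conf that caps concurrent recovery.

    Hypothetical helper; the value 2 is the one suggested in this thread.
    """
    if max_active < 1:
        raise ValueError("max_active must be at least 1")
    parser = configparser.ConfigParser()
    parser["osd"] = {"osd recovery max active": str(max_active)}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


def injectargs_command(max_active: int = 2) -> List[str]:
    """Build the argv that applies the same cap to all running OSDs.

    Assumption: the 'ceph tell osd.* injectargs' form is accepted by the
    installed Ceph release. The list is meant for subprocess.run(), so the
    '*' is passed literally and needs no shell quoting.
    """
    if max_active < 1:
        raise ValueError("max_active must be at least 1")
    return [
        "ceph", "tell", "osd.*", "injectargs",
        f"--osd-recovery-max-active {max_active}",
    ]


if __name__ == "__main__":
    # Print the fragment to paste into ceph.conf, then the runtime command.
    print(recovery_conf_fragment(2), end="")
    print(" ".join(injectargs_command(2)))
```

The config fragment only takes effect when an OSD is (re)started, while the
injectargs form changes the running daemons, so on a cluster that is already
recovering you would probably want both.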