Thanks, but I don't quite understand how to determine whether a monitor is overloaded. And if it is, will starting several monitors help?

Sent from my iPhone

On 2013-5-15, at 23:07, "Jim Schutt" <jaschut@xxxxxxxxxx> wrote:

> On 05/14/2013 09:23 PM, Chen, Xiaoxi wrote:
>>> How responsive generally is the machine under load? Is there available CPU?
>> The machine works well, and the affected OSDs tend to be the same ones, seemingly because they have relatively slower disks (the disk type is the same but the latency is a bit higher, 8ms -> 10ms).
>>
>> top shows no idle % but still 30+% io_wait; my colleague told me that io_wait can be treated as free.
>>
>> Another piece of information: offloading the heartbeat to a 1Gb NIC doesn't solve the problem. What's more, when we run a random write test, we can still see this flapping happen. So I suspect it may be related to the CPU scheduler: the heartbeat thread (in a busy OSD) fails to get enough CPU cycles.
>>
>
> FWIW, also take a close look at your monitor daemons, and
> whether they show any signs of being overloaded.
>
> I frequently see OSDs wrongly marked down when my
> mons cannot keep up with their workload.
>
> -- Jim
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com