Hi, Sage

Thanks! I will try it in my next round of testing.

------------------
hzwulibin
2015-11-23

-------------------------------------------------------------
From: Sage Weil <sage@xxxxxxxxxxxx>
Date: 2015-11-22 01:49
To: Haomai Wang
Cc: Libin Wu, ceph-devel
Subject: Re: why my cluster become unavailable

On Sun, 22 Nov 2015, Haomai Wang wrote:
> On Thu, Nov 19, 2015 at 11:26 PM, Libin Wu <hzwulibin@xxxxxxxxx> wrote:
> > Hi, cephers
> >
> > I have a cluster of 6 OSD servers; every server has 8 OSDs.
> >
> > I marked 4 OSDs out on every server, and now my client I/O is blocked.
> >
> > I rebooted my client and then created a new rbd device, but the new
> > device cannot complete writes either.
> >
> > I understand that some data may be lost, since all three replicas of
> > some objects were lost, but why did the whole cluster become unavailable?
> >
> > There are 80 incomplete PGs and 4 down+incomplete PGs.
> >
> > Is there any way to solve this problem?
>
> Yes. If you don't have a special crushmap to control the data
> placement policy, the PGs will lack the metadata they need to come up.
> You need to re-add the OSDs you marked out, or force-remove the PGs
> that are incomplete (hope it's just a test).

Is min_size 2 or 1? Reducing it to 1 will generally clear some of the
incomplete PGs. Just remember to raise it back to 2 after the cluster
recovers.

sage
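
For reference, the min_size change suggested above could be scripted roughly as follows. This is only a sketch under assumptions: the thread never names the pool, so "rbd" is a placeholder, and the helper names (get_min_size_cmd, set_min_size_cmd, run) are made up for illustration. It builds the standard `ceph osd pool get/set <pool> min_size` commands and, by default, only prints them instead of running them.

```python
import shlex
import subprocess

POOL = "rbd"  # assumption: the pool name is not given in the thread


def get_min_size_cmd(pool):
    """Build the command that prints the pool's current min_size."""
    return ["ceph", "osd", "pool", "get", pool, "min_size"]


def set_min_size_cmd(pool, min_size):
    """Build the command that sets the pool's min_size."""
    if min_size < 1:
        raise ValueError("min_size must be at least 1")
    return ["ceph", "osd", "pool", "set", pool, "min_size", str(min_size)]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(get_min_size_cmd(POOL))      # check whether min_size is 2 or 1
    run(set_min_size_cmd(POOL, 1))   # temporarily allow I/O with one replica
    # ... wait for the cluster to recover before continuing ...
    run(set_min_size_cmd(POOL, 2))   # raise it back afterwards
```

With dry_run left at True the script is safe to run anywhere; pass dry_run=False on a node with admin credentials only once the printed commands look right for your pool.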