Hi Sage,

I have a question about the min_size of a pool. The default value of min_size is 2, but with this setting, when two OSDs go down at the same time (meaning two replicas are lost), IO is blocked. We want to set min_size to 1 in our production environment, as we consider it a normal case for two OSDs (on different hosts, of course) to be down at the same time. Is there any potential problem with this setting? We use version 0.80.10. Thanks!

------------------
hzwulibin
2015-11-26

-------------------------------------------------------------
From: "hzwulibin" <hzwulibin@xxxxxxxxx>
Date: 2015-11-23 09:00
To: Sage Weil, Haomai Wang
Cc: ceph-devel
Subject: Re: why my cluster become unavailable

Hi Sage,

Thanks! Will try it in the next round of testing!

------------------
hzwulibin
2015-11-23

-------------------------------------------------------------
From: Sage Weil <sage@xxxxxxxxxxxx>
Date: 2015-11-22 01:49
To: Haomai Wang
Cc: Libin Wu, ceph-devel
Subject: Re: why my cluster become unavailable

On Sun, 22 Nov 2015, Haomai Wang wrote:
> On Thu, Nov 19, 2015 at 11:26 PM, Libin Wu <hzwulibin@xxxxxxxxx> wrote:
> > Hi, cephers
> >
> > I have a cluster of 6 OSD servers; every server has 8 OSDs.
> >
> > I marked 4 OSDs out on every server, and then my client IO blocked.
> >
> > I rebooted my client and then created a new rbd device, but the new
> > device also can't write IO.
> >
> > Yes, I understand that some data may be lost since all three replicas of some
> > objects were lost, but why did the cluster become unavailable?
> >
> > There are 80 incomplete pgs and 4 down+incomplete pgs.
> >
> > Is there any way I can solve the problem?
>
> Yes, if you don't have a special crushmap controlling the data
> placement policy, the pgs will lack the metadata necessary to boot. You need
> to re-add the outed OSDs or force-remove the pgs which are incomplete (hope it's
> just a test).

Is min_size 2 or 1? Reducing it to 1 will generally clear some of the
incomplete pgs. Just remember to raise it back to 2 after the cluster
recovers.
sage
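For reference, Sage's suggestion above maps onto the standard `ceph` CLI as sketched below. This is a minimal outline, not from the thread itself; the pool name `rbd` is an assumed example, so substitute your actual pool name:

```shell
# Check the pool's current min_size (pool name "rbd" is an assumption;
# use your own pool name).
ceph osd pool get rbd min_size

# Temporarily lower min_size to 1 so PGs that still have one surviving
# replica can go active and serve IO again.
ceph osd pool set rbd min_size 1

# Watch recovery progress; wait until the incomplete PGs clear.
ceph -s

# Once the cluster has recovered, raise min_size back to 2 so a single
# remaining replica is never writable on its own.
ceph osd pool set rbd min_size 2
```

Note that lowering min_size only helps PGs that retain at least one complete copy of their data; PGs whose every replica was lost stay incomplete regardless of this setting.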