Re: Re: Re: 2 replications, flapping can not stop for a very long time


 



The OSD is supposed to stay down if any of the networks are missing.
---------------- with osd1, osd2, osd3: if osd2's cluster network is cut off, will all of the osds (osd1/osd2/osd3) stay down?
Ceph is a CP system in CAP parlance; there's no such thing as a CA system. ;)

What version of Ceph are you testing right now? -------------- ceph version: 0.94.2
-Greg

-----Original Message-----
From: Gregory Farnum [mailto:gfarnum@xxxxxxxxxx]
Sent: September 15, 2015 4:57
To: zhaomingyue 09440 (RD)
Cc: Chen, Xiaoxi; huang jun; ceph-devel@xxxxxxxxxxxxxxx
Subject: Re: Re: Re: 2 replications, flapping can not stop for a very long time

The OSD is supposed to stay down if any of the networks are missing.
Ceph is a CP system in CAP parlance; there's no such thing as a CA system. ;)

What version of Ceph are you testing right now?
-Greg

On Mon, Sep 14, 2015 at 1:02 AM, zhao.mingyue@xxxxxxx <zhao.mingyue@xxxxxxx> wrote:
> Thanks for your explanation~
> You mean the original design leads to this result? In that network situation, flapping is bound to appear and cannot terminate unless someone (such as an administrator) forcibly marks the osd out, is that right?
> I also want to ask whether the ceph community has a plan to solve this problem.
>
> Thanks~
>
> -----Original Message-----
> From: Chen, Xiaoxi [mailto:xiaoxi.chen@xxxxxxxxx]
> Sent: September 14, 2015 15:21
> To: huang jun; zhaomingyue 09440 (RD)
> Cc: ceph-devel@xxxxxxxxxxxxxxx
> Subject: RE: Re: 2 replications, flapping can not stop for a very long time
>
> This is a kind of unsolvable problem: in CAP terms, we chose Consistency and Availability, so we had to give up Partition tolerance.
>
> There are three networks here: mon <-> osd, osd <-public-> osd, and osd <-cluster-> osd. If some of the networks are reachable but some are not, flapping will likely happen.
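For reference, the public/cluster split being discussed comes from a ceph.conf layout like the following (addresses are illustrative, not from the thread):

```ini
[global]
# Client traffic and MON <-> OSD messaging travel on the public network
public network = 192.168.1.0/24
# Replication traffic and OSD <-> OSD heartbeats also use the cluster network
cluster network = 10.0.0.0/24
```

Cutting only the cluster network leaves the OSD able to talk to the MON while its peers can no longer reach it, which is exactly the split-view that drives the flapping.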
>
> -----Original Message-----
> From: ceph-devel-owner@xxxxxxxxxxxxxxx 
> [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of huang jun
> Sent: Sunday, September 13, 2015 5:46 PM
> To: zhao.mingyue@xxxxxxx
> Cc: ceph-devel@xxxxxxxxxxxxxxx
> Subject: Re: Re: 2 replications, flapping can not stop for a very long time
>
> 2015-09-13 14:07 GMT+08:00 zhao.mingyue@xxxxxxx <zhao.mingyue@xxxxxxx>:
>> hi, did you set both public_network and cluster_network, but cut off only the cluster_network?
>> And do you have more than one osd on the same host?
>> =============================yes, public network + cluster network, and I
>> cut off the cluster network; 2 nodes, each node has several osds;
>>
>> If so, maybe the cluster cannot get stable: each osd has peers at the previous and next osd ids, and they exchange ping messages.
>> When you cut off the cluster_network, the peer osds on the other host cannot receive the pings, so they report the osd failure to the MON; once the MON gathers enough reporters and reports, the osd is marked down.
>> =============================when an osd receives a new map in which it is marked down, it thinks the MON wrongly marked it down. What will it do then, rejoin the cluster or something else? Can you give me a more detailed explanation?
>
It will send a boot message to the MON, and will be marked UP by the MON.
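The report-gathering step huang jun describes can be sketched roughly like this. It is a simplified model: the thresholds mirror the idea behind Ceph's `mon_osd_min_down_reporters`/`mon_osd_min_down_reports` options, but the class and logic here are illustrative, not Ceph's actual code:

```python
# Simplified model of how a monitor might decide an OSD is down:
# it waits until enough distinct reporters AND enough total failure
# reports have accumulated before marking the target down.
class FailureTracker:
    def __init__(self, min_reporters=2, min_reports=3):
        self.min_reporters = min_reporters
        self.min_reports = min_reports
        self.reports = {}  # target osd id -> {reporter osd id: report count}

    def add_report(self, target, reporter):
        """Record one failure report; return True once the target
        has crossed both thresholds and should be marked down."""
        per_target = self.reports.setdefault(target, {})
        per_target[reporter] = per_target.get(reporter, 0) + 1
        reporters = len(per_target)
        total = sum(per_target.values())
        return reporters >= self.min_reporters and total >= self.min_reports

    def clear(self, target):
        # e.g. when the target reports in alive over the public network
        self.reports.pop(target, None)

t = FailureTracker()
assert t.add_report(target=2, reporter=1) is False  # one reporter, one report
assert t.add_report(target=2, reporter=1) is False  # still only one reporter
assert t.add_report(target=2, reporter=3) is True   # two reporters, three reports
```

The key point for this thread: when the "down" osd then reaches the MON over the public network and boots again, the accumulated reports are cleared and the cycle restarts.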
>>
>> But the osd can still report to the MON because the public_network is ok, so the MON thinks the osd was wrongly marked down and marks it UP again.
>> =============================you mean that if the MON receives even ONE message from this osd, it will mark the osd up?
>>
>
>> So flapping happens again and again.
>> ============================= I tried 3 replications (public network
>> + cluster network, 3 nodes, each node has several osds). Although
>> flapping occurred, after several minutes the cluster became stable. With 2 replications, after waiting the same interval, the cluster could not become stable; so I'm confused about the mechanism: how does the monitor decide which osd is actually down?
>>
It's weird: if you cut off the cluster_network, the osds on the other node cannot get the ping messages, and will naturally conclude the osd has failed.
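The down/up cycle being described, peers reporting the osd down over the missing cluster network while the osd re-registers over the healthy public network, can be sketched as a loop (a purely illustrative toy model, not Ceph code):

```python
# Toy model of the flapping cycle when only the cluster network is cut.
def flapping_cycle(rounds):
    """Return the sequence of states one OSD cycles through."""
    states = []
    for _ in range(rounds):
        # Peers miss cluster-network heartbeats, report the failure,
        # and the MON marks the OSD down.
        states.append("down")
        # The OSD still reaches the MON over the public network,
        # sends a boot message, and is marked up again.
        states.append("up")
    return states

assert flapping_cycle(2) == ["down", "up", "down", "up"]
```

Nothing in the loop ever breaks the cycle, which matches the observation in this thread that the 2-replication cluster never settles without intervention.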
>
>> thanks
>>
>> -----Original Message-----
>> From: huang jun [mailto:hjwsm1989@xxxxxxxxx]
>> Sent: September 13, 2015 10:39
>> To: zhaomingyue 09440 (RD)
>> Cc: ceph-devel@xxxxxxxxxxxxxxx
>> Subject: Re: 2 replications, flapping can not stop for a very long time
>>
>> hi, did you set both public_network and cluster_network, but cut off only the cluster_network?
>> And do you have more than one osd on the same host?
>> If so, maybe the cluster cannot get stable: each osd has peers at the previous and next osd ids, and they exchange ping messages.
>> When you cut off the cluster_network, the peer osds on the other host cannot receive the pings, so they report the osd failure to the MON; once the MON gathers enough reporters and reports, the osd is marked down.
>> But the osd can still report to the MON because the public_network is ok, so the MON thinks the osd was wrongly marked down and marks it UP again.
>> So flapping happens again and again.
>>
>> 2015-09-12 20:26 GMT+08:00 zhao.mingyue@xxxxxxx <zhao.mingyue@xxxxxxx>:
>>>
>>> Hi,
>>> I'm testing the reliability of ceph recently, and I have run into the flapping problem.
>>> I have 2 replications, and I cut off the cluster network; now the flapping cannot stop. I have waited more than 30 min, but the status of the osds is still not stable.
>>>     I want to know: when the monitor receives reports from osds, how does it decide to mark an osd down?
>>>     (reports && reporters && grace) need to satisfy some conditions; how is the grace calculated?
>>> And how long will the flapping last? Must the flapping be stopped by configuration, such as marking an osd lost?
>>> Can someone help me?
>>> Thanks~
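On the grace question above: a rough sketch of the idea behind an adaptive failure grace period, modeled loosely on Ceph's `osd_heartbeat_grace` plus a "laggy" adjustment; the actual formula and its decay behavior are simplified here and should be checked against the source:

```python
# Simplified sketch of an adaptive grace period: start from a fixed
# heartbeat grace and stretch it for OSDs with a history of being
# wrongly marked down (merely "laggy"), so that flapping OSDs get
# longer before their peers' failure reports are believed.
def effective_grace(base_grace, laggy_probability, laggy_interval):
    """base_grace: configured heartbeat grace, in seconds.
    laggy_probability: decayed estimate in [0, 1] that past down
        markings were wrong (the OSD was just slow).
    laggy_interval: decayed average length of those laggy episodes,
        in seconds."""
    return base_grace + laggy_probability * laggy_interval

# An OSD with no laggy history keeps the base grace.
assert effective_grace(20.0, 0.0, 0.0) == 20.0
# An OSD that was wrongly marked down before gets a stretched grace.
assert effective_grace(20.0, 0.5, 60.0) == 50.0
```

Under this scheme a frequently flapping OSD accumulates laggy history and its effective grace grows, which dampens (but, as this thread shows, does not necessarily stop) the flapping.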
>>> ---------------------------------------------------------------------
>>> This e-mail and its attachments contain confidential information 
>>> from H3C, which is intended only for the person or entity whose 
>>> address is listed above. Any use of the information contained herein 
>>> in any way (including, but not limited to, total or partial 
>>> disclosure, reproduction, or dissemination) by persons other than 
>>> the intended
>>> recipient(s) is prohibited. If you receive this e-mail in error, 
>>> please notify the sender by phone or email immediately and delete it!
>>
>>
>>
>> --
>> thanks
>> huangjun
>
>
>
> --
> thanks
> huangjun



