The OSD is supposed to stay down if any of the networks are missing. Ceph is a CP system in CAP parlance; there's no such thing as a CA system. ;)

What version of Ceph are you testing right now?
-Greg

On Mon, Sep 14, 2015 at 1:02 AM, zhao.mingyue@xxxxxxx <zhao.mingyue@xxxxxxx> wrote:
> Thanks for your explanation~
> You mean the original design leads to this result? In that network situation, flapping must appear and cannot stop unless someone (such as an administrator) forcibly marks the OSD out, is that right?
> I also want to ask whether the Ceph community has a plan to solve this problem.
>
> Thanks~
>
> -----Original Message-----
> From: Chen, Xiaoxi [mailto:xiaoxi.chen@xxxxxxxxx]
> Sent: September 14, 2015 15:21
> To: huang jun; zhaomingyue 09440 (RD)
> Cc: ceph-devel@xxxxxxxxxxxxxxx
> Subject: RE: Re: 2 replications, flapping can not stop for a very long time
>
> This is a kind of unsolvable problem: in CAP we choose Consistency and Availability, so we have to give up Partition tolerance.
>
> There are three networks here: mon <-> osd, osd <- public -> osd, and osd <- cluster -> osd. If some of the networks are reachable but some are not, flapping is likely to happen.
>
> -----Original Message-----
> From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of huang jun
> Sent: Sunday, September 13, 2015 5:46 PM
> To: zhao.mingyue@xxxxxxx
> Cc: ceph-devel@xxxxxxxxxxxxxxx
> Subject: Re: Re: 2 replications, flapping can not stop for a very long time
>
> 2015-09-13 14:07 GMT+08:00 zhao.mingyue@xxxxxxx <zhao.mingyue@xxxxxxx>:
>> Hi, do you set both public_network and cluster_network, but just cut off the cluster_network?
>> And do you have more than one OSD on the same host?
>> ============================= Yes, public network + cluster network, and I
>> cut off the cluster network; 2 nodes, each node has several OSDs.
>>
>> If so, maybe the cluster cannot get stable: each OSD has peers at the previous and next OSD ids, and they can exchange ping messages.
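[Editor's note: Xiaoxi's three-network breakdown corresponds to the standard two-network ceph.conf layout the original poster describes. A minimal sketch; the subnets are made-up placeholders, not values from this thread:

```ini
[global]
    # Traffic between clients, MONs, and OSDs. OSDs also send their
    # boot messages and failure reports to the MONs over this network,
    # which is why a live public network lets a "down" OSD argue back.
    public network  = 192.168.1.0/24

    # OSD-to-OSD replication traffic; depending on the version, part of
    # the OSD heartbeat traffic may use this network as well. Cutting
    # only this network is what triggers the flapping discussed here.
    cluster network = 192.168.2.0/24
```

With this split, severing only the cluster network leaves every OSD able to reach the MONs while (some of) its heartbeat peers cannot reach it, which is exactly the asymmetric partition described below.]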
>> When you cut off the cluster_network, the peer OSDs on the other host cannot receive the pings, so they report the OSD failure to the MON; once the MON gathers enough reporters and reports, the OSD is marked down.
>> ============================= When an OSD receives a new map in which it is marked down, it thinks the MON wrongly marked it down. What will it do then: rejoin the cluster, or something else? Can you give me a more detailed explanation?
>
> It will send a boot message to the MON, and will be marked UP by the MON.
>>
>> But the OSD can still report to the MON because the public_network is OK, so the MON thinks the OSD was wrongly marked down and marks it UP.
>> ============================= You mean that if the MON receives a message ONE TIME from this OSD, it will mark the OSD up?
>>
>
>> So flapping happens again and again.
>> ============================= I tried 3 replications (public network + cluster network, 3 nodes, each node has several OSDs). Although flapping occurred, the cluster became stable after several minutes. With 2 replications, after waiting the same interval, the cluster still could not become stable. So I'm confused about the mechanism: how does the monitor decide which OSD is actually down?
>>
> That's weird: if you cut off the cluster_network, the OSDs on the other node cannot get the ping messages, and will naturally think the OSD has failed.
>
>> thanks
>>
>> -----Original Message-----
>> From: huang jun [mailto:hjwsm1989@xxxxxxxxx]
>> Sent: September 13, 2015 10:39
>> To: zhaomingyue 09440 (RD)
>> Cc: ceph-devel@xxxxxxxxxxxxxxx
>> Subject: Re: 2 replications, flapping can not stop for a very long time
>>
>> Hi, do you set both public_network and cluster_network, but just cut off the cluster_network?
>> And do you have more than one OSD on the same host?
>> If so, maybe the cluster cannot get stable: each OSD has peers at the previous and next OSD ids, and they can exchange ping messages.
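[Editor's note: huang jun's "MON gathers enough reporters and reports" step can be sketched as a toy model of the monitor's decision. This is not Ceph code; the constant names mirror the mon_osd_min_down_reporters / mon_osd_min_down_reports options of that era, but the values and function are illustrative only:

```python
# Toy model of the MON deciding whether to mark an OSD down, based on
# failure reports from its heartbeat peers. Not Ceph code; thresholds
# are illustrative, not the real defaults.
MIN_DOWN_REPORTERS = 2   # distinct OSDs that must report the failure
MIN_DOWN_REPORTS = 3     # total failure reports required

def should_mark_down(reports):
    """reports: list of reporter OSD ids, one entry per failure report."""
    num_reports = len(reports)
    num_reporters = len(set(reports))
    return (num_reporters >= MIN_DOWN_REPORTERS
            and num_reports >= MIN_DOWN_REPORTS)

# Two peers keep re-reporting osd.3 after the cluster network is cut:
print(should_mark_down([1, 2, 1]))   # three reports, two reporters -> True
print(should_mark_down([1, 1, 1]))   # one reporter is not enough   -> False
```

The flapping arises because this check only sees the heartbeat side: the reported OSD can still reach the MON over the public network, send a boot message, and be marked UP again, restarting the cycle.]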
>> When you cut off the cluster_network, the peer OSDs on the other host cannot receive the pings, so they report the OSD failure to the MON; once the MON gathers enough reporters and reports, the OSD is marked down.
>> But the OSD can still report to the MON because the public_network is OK, so the MON thinks the OSD was wrongly marked down and marks it UP.
>> So flapping happens again and again.
>>
>> 2015-09-12 20:26 GMT+08:00 zhao.mingyue@xxxxxxx <zhao.mingyue@xxxxxxx>:
>>>
>>> Hi,
>>> I'm testing the reliability of Ceph recently, and I have hit the flapping problem.
>>> I have 2 replications, and I cut off the cluster network. Now the flapping does not stop; I have waited more than 30 minutes, but the status of the OSDs is still not stable.
>>> I want to know: when the monitor receives reports from OSDs, how does it decide to mark an OSD down?
>>> (reports && reporters && grace) need to satisfy some conditions; how is the grace calculated?
>>> And how long will the flapping take to stop? Must the flapping be stopped by configuration, such as marking an OSD lost?
>>> Can someone help me?
>>> Thanks~
>>> ----------------------------------------------------------------------
>>> This e-mail and its attachments contain confidential information from
>>> H3C, which is intended only for the person or entity whose address is
>>> listed above. Any use of the information contained herein in any way
>>> (including, but not limited to, total or partial disclosure,
>>> reproduction, or dissemination) by persons other than the intended
>>> recipient(s) is prohibited. If you receive this e-mail in error,
>>> please notify the sender by phone or email immediately and delete it!
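[Editor's note: on the "(reports && reporters && grace)" question above: the monitor also waits at least a heartbeat grace period before acting on failure reports, and that grace can be stretched for OSDs with a history of lagginess. A rough sketch of the idea; osd_heartbeat_grace is a real option, but this function and its numbers are a simplification, not the monitor's actual formula:

```python
# Rough sketch of an adaptive failure grace period: start from the
# configured osd_heartbeat_grace and stretch it for OSDs that were
# historically "laggy" (marked down and then found to be alive).
# Simplified model; values are illustrative, not Ceph defaults.
OSD_HEARTBEAT_GRACE = 20.0  # seconds; base grace before marking down

def effective_grace(laggy_probability, laggy_interval):
    """laggy_probability: fraction of past down-markings of this OSD
    that proved wrong (0..1); laggy_interval: typical duration of such
    episodes, in seconds."""
    return OSD_HEARTBEAT_GRACE + laggy_probability * laggy_interval

# An OSD that flapped before gets extra slack before being marked down:
print(effective_grace(0.0, 0.0))    # never laggy -> 20.0
print(effective_grace(0.5, 60.0))   # half-laggy  -> 50.0
```

Under a persistent one-sided partition like the one in this thread, stretching the grace only slows the cycle; it does not break it, which is why the thread converges on operator intervention (marking the OSD out) as the way to stop it.]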
>> >> >> >> -- >> thanks >> huangjun > > > > -- > thanks > huangjun > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html