On Tue, Mar 25, 2014 at 9:24 AM, Travis Rhoden <trhoden@xxxxxxxxx> wrote:
> Okay, last one until I get some guidance. Sorry for the spam, but wanted to
> paint a full picture. Here are debug logs from all three mons, capturing
> what looks like an election sequence to me:
>
> ceph0:
> 2014-03-25 16:17:24.324846 7fa5c53fc700 5 mon.ceph0@0(electing).elector(35)
> start -- can i be leader?
> 2014-03-25 16:17:24.324900 7fa5c53fc700 1 mon.ceph0@0(electing).elector(35)
> init, last seen epoch 35
> 2014-03-25 16:17:24.324913 7fa5c53fc700 1 -- 10.10.30.0:6789/0 --> mon.1
> 10.10.30.1:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose
> 35) v4 -- ?+0 0x263d480
> 2014-03-25 16:17:24.324948 7fa5c53fc700 1 -- 10.10.30.0:6789/0 --> mon.2
> 10.10.30.2:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose
> 35) v4 -- ?+0 0x263d6c0
> 2014-03-25 16:17:25.353975 7fa5c4bfb700 1 -- 10.10.30.0:6789/0 <== mon.2
> 10.10.30.2:6789/0 493 ==== election(b3f38955-4321-4850-9ddb-3b09940dc951
> propose 35) v4 ==== 537+0+0 (4036841703 0 0) 0x265fd80 con 0x1df0c60
> 2014-03-25 16:17:25.354042 7fa5c4bfb700 5 mon.ceph0@0(electing).elector(35)
> handle_propose from mon.2
> 2014-03-25 16:17:29.325107 7fa5c53fc700 5 mon.ceph0@0(electing).elector(35)
> election timer expired
>
> ceph1:
> 2014-03-25 16:17:24.325529 7ffe48cc1700 5 mon.ceph1@1(electing).elector(35)
> handle_propose from mon.0
> 2014-03-25 16:17:24.325535 7ffe48cc1700 5 mon.ceph1@1(electing).elector(35)
> defer to 0
> 2014-03-25 16:17:24.325546 7ffe48cc1700 1 -- 10.10.30.1:6789/0 --> mon.0
> 10.10.30.0:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 ack 35)
> v4 -- ?+0 0x1bbfb40
> 2014-03-25 16:17:25.354038 7ffe48cc1700 1 -- 10.10.30.1:6789/0 <== mon.2
> 10.10.30.2:6789/0 489 ==== election(b3f38955-4321-4850-9ddb-3b09940dc951
> propose 35) v4 ==== 537+0+0 (4036841703 0 0) 0x1bbf6c0 con 0x14d9b00
> 2014-03-25 16:17:25.354102 7ffe48cc1700 5 mon.ceph1@1(electing).elector(35)
> handle_propose from mon.2
> 2014-03-25 16:17:25.354113 7ffe48cc1700 5 mon.ceph1@1(electing).elector(35)
> no, we already acked 0
>
> ceph2:
> 2014-03-25 16:17:20.353135 7f80d0013700 5 mon.ceph2@2(electing).elector(35)
> election timer expired
> 2014-03-25 16:17:20.353154 7f80d0013700 5 mon.ceph2@2(electing).elector(35)
> start -- can i be leader?
> 2014-03-25 16:17:20.353225 7f80d0013700 1 mon.ceph2@2(electing).elector(35)
> init, last seen epoch 35
> 2014-03-25 16:17:20.353238 7f80d0013700 1 -- 10.10.30.2:6789/0 --> mon.0
> 10.10.30.0:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose
> 35) v4 -- ?+0 0x18e7900
> 2014-03-25 16:17:20.353272 7f80d0013700 1 -- 10.10.30.2:6789/0 --> mon.1
> 10.10.30.1:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose
> 35) v4 -- ?+0 0x18e7d80
> 2014-03-25 16:17:25.353559 7f80d0013700 5 mon.ceph2@2(electing).elector(35)
> election timer expired
> 2014-03-25 16:17:25.353578 7f80d0013700 5 mon.ceph2@2(electing).elector(35)
> start -- can i be leader?
> 2014-03-25 16:17:25.353647 7f80d0013700 1 mon.ceph2@2(electing).elector(35)
> init, last seen epoch 35
> 2014-03-25 16:17:25.353660 7f80d0013700 1 -- 10.10.30.2:6789/0 --> mon.0
> 10.10.30.0:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose
> 35) v4 -- ?+0 0x19b7240
> 2014-03-25 16:17:25.353695 7f80d0013700 1 -- 10.10.30.2:6789/0 --> mon.1
> 10.10.30.1:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose
> 35) v4 -- ?+0 0x19b76c0
> 2014-03-25 16:17:30.354040 7f80d0013700 5 mon.ceph2@2(electing).elector(35)
> election timer expired
>
> Oddly, it looks to me like mon.2 (ceph2) never handles/receives the proposal
> from mon.0 (ceph0). But I admit I have no clue how monitor election works.

Likewise, mon.1 sends an ack to mon.0 that is never received. I think
you've got a busted firewall that's allowing some one-way communication,
but not two-way.
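The same cross-check can be done mechanically. What follows is only a rough Python sketch, not a Ceph tool: it pairs every election message a monitor logs as sent (`-->`) with evidence of arrival on the peer (a `<==` line or a `handle_propose from` line) and reports what is left over. The name `find_unreceived`, the regular expressions, and the `SAMPLE` data are assumptions made for illustration; the sample lines are trimmed from the logs quoted above, with timestamps and thread ids dropped.

```python
import re

# Line shapes taken from the quoted logs:
#   sent:     "--> mon.N <addr> -- election(<fsid> <op> <epoch>)"
#   received: "<== mon.N <addr> <seq> ==== election(<fsid> <op> <epoch>)"
SENT = re.compile(r"--> mon\.(\d+) \S+ -- election\(\S+ (\w+) \d+\)")
RECEIVED = re.compile(r"<== mon\.(\d+) \S+ \d+ ==== election\(\S+ (\w+) \d+\)")
# Elector line that also shows a proposal arrived.
HANDLED = re.compile(r"handle_propose from mon\.(\d+)")


def find_unreceived(logs):
    """logs maps a monitor rank to its log text. Returns sorted
    (sender, receiver, op) tuples that were sent but never seen arriving."""
    sent, seen = set(), set()
    for rank, text in logs.items():
        for line in text.splitlines():
            if m := SENT.search(line):
                sent.add((rank, int(m.group(1)), m.group(2)))
            if m := RECEIVED.search(line):
                seen.add((int(m.group(1)), rank, m.group(2)))
            if m := HANDLED.search(line):
                seen.add((int(m.group(1)), rank, "propose"))
    return sorted(sent - seen)


# Illustrative input only: lines trimmed from the logs in this thread.
FSID = "b3f38955-4321-4850-9ddb-3b09940dc951"
SAMPLE = {
    0: f"""-- 10.10.30.0:6789/0 --> mon.1 10.10.30.1:6789/0 -- election({FSID} propose 35) v4
-- 10.10.30.0:6789/0 --> mon.2 10.10.30.2:6789/0 -- election({FSID} propose 35) v4
-- 10.10.30.0:6789/0 <== mon.2 10.10.30.2:6789/0 493 ==== election({FSID} propose 35) v4
mon.ceph0@0(electing).elector(35) handle_propose from mon.2""",
    1: f"""mon.ceph1@1(electing).elector(35) handle_propose from mon.0
-- 10.10.30.1:6789/0 --> mon.0 10.10.30.0:6789/0 -- election({FSID} ack 35) v4
-- 10.10.30.1:6789/0 <== mon.2 10.10.30.2:6789/0 489 ==== election({FSID} propose 35) v4
mon.ceph1@1(electing).elector(35) handle_propose from mon.2""",
    2: f"""-- 10.10.30.2:6789/0 --> mon.0 10.10.30.0:6789/0 -- election({FSID} propose 35) v4
-- 10.10.30.2:6789/0 --> mon.1 10.10.30.1:6789/0 -- election({FSID} propose 35) v4""",
}

if __name__ == "__main__":
    for src, dst, op in find_unreceived(SAMPLE):
        print(f"mon.{src} -> mon.{dst}: {op} sent, no receipt logged")
```

The comparison ignores timestamps and message counts, so it only makes sense when all the logs cover the same window. Anything it flags is a hint about which direction to test on the wire, not proof of a dropped packet.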
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com