Well, since I spammed the list earlier, I should fess up to my mistake. I forgot to change MTU sizes on the 10G switch after I switched the hosts to jumbo frames. So yes, I had a very unhappy networking stack.
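(For anyone who trips over the same thing: a quick way to confirm an MTU mismatch is to send don't-fragment pings at jumbo size between the hosts. The interface name below is just a placeholder; the peer address is one of the monitor IPs from this thread:

# ip link show eth2 | grep mtu
# ping -M do -s 8972 -c 3 10.10.30.2

8972 bytes is 9000 minus the 20-byte IP and 8-byte ICMP headers. If ordinary pings work but these time out or report errors, some hop in the path -- in my case the switch -- is still at 1500.)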
On the upside, playing with Cumulus Linux on switches is fun.

On Tue, Mar 25, 2014 at 1:12 PM, Travis Rhoden <trhoden@xxxxxxxxx> wrote:
Thanks for the feedback -- I'll post back with more detailed logs if anything looks fishy!

On Tue, Mar 25, 2014 at 1:10 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
Well, you could try running with messenger debugging cranked all the
way up and see if there's something odd happening there (eg, not
handling incoming messages), but based on not having any other reports
of this, I think your networking stack is unhappy in some way. *shrug*
(Higher log levels showing what the individual pipes are doing will
narrow it down on the Ceph side.)
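(For anyone following along: with the monitors out of quorum, "ceph tell ... injectargs" may not get through, so the simplest ways to crank up messenger logging are either to add

debug ms = 20
debug mon = 20

to the [mon] section of ceph.conf and restart the mons, or to bump it live through each monitor's admin socket, assuming the default socket path:

# ceph --admin-daemon /var/run/ceph/ceph-mon.ceph0.asok config set debug_ms 20

The per-pipe activity then shows up in the monitor log, typically /var/log/ceph/ceph-mon.<id>.log.)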
On Tue, Mar 25, 2014 at 10:05 AM, Travis Rhoden <trhoden@xxxxxxxxx> wrote:
>
>
>
> On Tue, Mar 25, 2014 at 12:53 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>
>> On Tue, Mar 25, 2014 at 9:24 AM, Travis Rhoden <trhoden@xxxxxxxxx> wrote:
>> > Okay, last one until I get some guidance. Sorry for the spam, but wanted to
>> > paint a full picture. Here are debug logs from all three mons, capturing
>> > what looks like an election sequence to me:
>> >
>> > ceph0:
>> > 2014-03-25 16:17:24.324846 7fa5c53fc700 5 mon.ceph0@0(electing).elector(35) start -- can i be leader?
>> > 2014-03-25 16:17:24.324900 7fa5c53fc700 1 mon.ceph0@0(electing).elector(35) init, last seen epoch 35
>> > 2014-03-25 16:17:24.324913 7fa5c53fc700 1 -- 10.10.30.0:6789/0 --> mon.1 10.10.30.1:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose 35) v4 -- ?+0 0x263d480
>> > 2014-03-25 16:17:24.324948 7fa5c53fc700 1 -- 10.10.30.0:6789/0 --> mon.2 10.10.30.2:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose 35) v4 -- ?+0 0x263d6c0
>> > 2014-03-25 16:17:25.353975 7fa5c4bfb700 1 -- 10.10.30.0:6789/0 <== mon.2 10.10.30.2:6789/0 493 ==== election(b3f38955-4321-4850-9ddb-3b09940dc951 propose 35) v4 ==== 537+0+0 (4036841703 0 0) 0x265fd80 con 0x1df0c60
>> > 2014-03-25 16:17:25.354042 7fa5c4bfb700 5 mon.ceph0@0(electing).elector(35) handle_propose from mon.2
>> > 2014-03-25 16:17:29.325107 7fa5c53fc700 5 mon.ceph0@0(electing).elector(35) election timer expired
>> >
>> > ceph1:
>> > 2014-03-25 16:17:24.325529 7ffe48cc1700 5 mon.ceph1@1(electing).elector(35) handle_propose from mon.0
>> > 2014-03-25 16:17:24.325535 7ffe48cc1700 5 mon.ceph1@1(electing).elector(35) defer to 0
>> > 2014-03-25 16:17:24.325546 7ffe48cc1700 1 -- 10.10.30.1:6789/0 --> mon.0 10.10.30.0:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 ack 35) v4 -- ?+0 0x1bbfb40
>> > 2014-03-25 16:17:25.354038 7ffe48cc1700 1 -- 10.10.30.1:6789/0 <== mon.2 10.10.30.2:6789/0 489 ==== election(b3f38955-4321-4850-9ddb-3b09940dc951 propose 35) v4 ==== 537+0+0 (4036841703 0 0) 0x1bbf6c0 con 0x14d9b00
>> > 2014-03-25 16:17:25.354102 7ffe48cc1700 5 mon.ceph1@1(electing).elector(35) handle_propose from mon.2
>> > 2014-03-25 16:17:25.354113 7ffe48cc1700 5 mon.ceph1@1(electing).elector(35) no, we already acked 0
>> >
>> > ceph2:
>> > 2014-03-25 16:17:20.353135 7f80d0013700 5 mon.ceph2@2(electing).elector(35) election timer expired
>> > 2014-03-25 16:17:20.353154 7f80d0013700 5 mon.ceph2@2(electing).elector(35) start -- can i be leader?
>> > 2014-03-25 16:17:20.353225 7f80d0013700 1 mon.ceph2@2(electing).elector(35) init, last seen epoch 35
>> > 2014-03-25 16:17:20.353238 7f80d0013700 1 -- 10.10.30.2:6789/0 --> mon.0 10.10.30.0:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose 35) v4 -- ?+0 0x18e7900
>> > 2014-03-25 16:17:20.353272 7f80d0013700 1 -- 10.10.30.2:6789/0 --> mon.1 10.10.30.1:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose 35) v4 -- ?+0 0x18e7d80
>> > 2014-03-25 16:17:25.353559 7f80d0013700 5 mon.ceph2@2(electing).elector(35) election timer expired
>> > 2014-03-25 16:17:25.353578 7f80d0013700 5 mon.ceph2@2(electing).elector(35) start -- can i be leader?
>> > 2014-03-25 16:17:25.353647 7f80d0013700 1 mon.ceph2@2(electing).elector(35) init, last seen epoch 35
>> > 2014-03-25 16:17:25.353660 7f80d0013700 1 -- 10.10.30.2:6789/0 --> mon.0 10.10.30.0:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose 35) v4 -- ?+0 0x19b7240
>> > 2014-03-25 16:17:25.353695 7f80d0013700 1 -- 10.10.30.2:6789/0 --> mon.1 10.10.30.1:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose 35) v4 -- ?+0 0x19b76c0
>> > 2014-03-25 16:17:30.354040 7f80d0013700 5 mon.ceph2@2(electing).elector(35) election timer expired
>> >
>> > Oddly, it looks to me like mon.2 (ceph2) never handles/receives the proposal
>> > from mon.0 (ceph0). But I admit I have no clue how monitor election works.
>>
>> Likewise, mon.1 sends an ack to mon.0 that is never received. I think
>> you've got a busted firewall that's allowing some one-way
>> communication, but not two-way.
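(One way to test that theory, assuming tcpdump is available and substituting the real cluster-facing interface for the placeholder eth2, is to capture on both ends of a suspect pair at the same time, e.g. on ceph0 and ceph2:

# tcpdump -ni eth2 'tcp port 6789 and host 10.10.30.0 and host 10.10.30.2'

If packets leave one host but never show up in the capture on the other, something in between is dropping them.)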
>
>
> Hmm, I see that is missing as well. There is no hardware firewall in play,
> and I've checked iptables on all nodes -- no firewall rules are in place at
> this time. Just default ACCEPT rules everywhere.
>
> However, I do see some weirdness here:
>
> ceph0:
> # netstat -atln
> Active Internet connections (servers and established)
> Proto Recv-Q Send-Q Local Address Foreign Address State
> tcp 0 0 10.10.30.0:6789 0.0.0.0:* LISTEN
> tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
> tcp 0 0 10.1.30.0:22 10.1.0.1:33129 ESTABLISHED
> tcp 0 85128 10.10.30.0:40760 10.10.30.2:6789 ESTABLISHED
> tcp 0 0 10.10.30.0:43920 10.10.30.1:6789 ESTABLISHED
> tcp6 0 0 :::22 :::* LISTEN
>
> ceph1:
> # netstat -atln
> Active Internet connections (servers and established)
> Proto Recv-Q Send-Q Local Address Foreign Address State
> tcp 0 0 10.10.30.1:6789 0.0.0.0:* LISTEN
> tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
> tcp 0 79643 10.10.30.1:6789 10.10.30.0:43920 ESTABLISHED
> tcp 0 0 10.10.30.1:41876 10.10.30.2:6789 ESTABLISHED
> tcp 0 0 10.1.30.1:22 10.1.0.1:38382 ESTABLISHED
> tcp6 0 0 :::22 :::* LISTEN
>
> ceph2:
> # netstat -atln
> Active Internet connections (servers and established)
> Proto Recv-Q Send-Q Local Address Foreign Address State
> tcp 0 0 10.10.30.2:6789 0.0.0.0:* LISTEN
> tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
> tcp 0 0 10.10.30.2:6789 10.10.30.1:41876 ESTABLISHED
> tcp 0 0 10.1.30.2:22 10.1.0.1:58195 ESTABLISHED
> tcp 0 0 10.10.30.2:6789 10.10.30.0:40760 ESTABLISHED
> tcp6 0 0 :::22 :::* LISTEN
>
> Running those netstat commands multiple times shows the Send-Q backed up on
> both ceph0 and ceph1. The numbers stay the same.
>
> So I see that mon.0 is backed up sending to mon.2, and that mon.1 is backed
> up sending to mon.0. And mon.2 has no backup.
>
> Still baffled.
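(For what it's worth: a Send-Q that stays backed up means the kernel has data queued that the peer never acknowledges, so the sender keeps retransmitting. Two quick ways to see that, with eth2 again standing in for the real interface:

# netstat -s | grep -i retrans
# tcpdump -ni eth2 'tcp port 6789 and greater 1400'

If the retransmit counters keep climbing and only the large segments go unanswered while small ones are ACKed fine, that points at an MTU/jumbo-frame mismatch in the path rather than a firewall -- which is what it turned out to be here.)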
>
>> -Greg
>>
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com