On 26 April 2017 at 21:26, Sage Weil <sweil@xxxxxxxxxx> wrote:
> On Wed, 26 Apr 2017, Jerry Lee wrote:
>> Hi,
>>
>> I ran into an issue where ceph-mgr consumes more and more CPU even
>> when there is no I/O in my cluster. Some logs and strace output are
>> provided below. From the logs, the "wrong node" messages start
>> appearing when a node leaves and rejoins the cluster. The question is:
>> why does ceph-mgr keep retrying the old connections even after it
>> knows the new (correct) nonce? The number of connection attempts also
>> keeps increasing over time. The only way to mitigate the issue is to
>> restart the ceph-mgr process, but after several hours the issue occurs
>> again. Is there any suggestion or idea for how to solve this? Thanks!
>>
>> environment:
>> =========
>> linux-4.2.8
>> ceph version v11.0.2
>
> Are you sure that's the version you're running? That's a random
> pre-kraken development release.

Yes, I found kraken 11.0.2 released in Oct. 2016, and I just gave it a
try. Is there any recommended version for kraken? Or should I use Jewel
instead?

> That said, even if it's 11.2.0, we've fixed a ton of mgr issues in the
> run-up to luminous, so I'm not surprised if you've seen problems on
> kraken. The easy workaround is to just stop the mgr daemon--it doesn't
> really do anything on kraken yet!
>
> sage

I will stop the mgr daemon and keep monitoring the cluster status.
Thanks for your quick reply!

Jerry

>> 8 nodes, with 8 8TB disks on each node.
>> - 3 MONs
>> - 3 MGRs
>> - 3 MDSs
>>
>> parts of ceph-mgr log:
>> ================
>> 2017-04-25 12:27:23.298417 7f2b74d31700 0 -- 192.168.2.8:6849/3516 >> 192.168.2.7:0/3459516139 conn(0x55b915d77800 :6849 s=STATE_OPEN pgs=2 cs=1 l=0).fault initiating reconnect
>> 2017-04-25 12:27:23.480121 7f2b74530700 0 -- 192.168.2.8:6849/3516 >> 192.168.2.7:0/1237847161 conn(0x55b915d79000 :6849 s=STATE_OPEN pgs=2 cs=1 l=0).fault initiating reconnect
>> 2017-04-25 12:27:23.614977 7f2b75532700 0 -- 192.168.2.8:6849/3516 >> 192.168.2.7:0/2251622435 conn(0x55b915d7a800 :6849 s=STATE_OPEN pgs=2 cs=1 l=0).fault initiating reconnect
>> ...
>> 2017-04-25 14:21:54.632838 7f2b74530700 0 -- 192.168.2.8:6849/3516 >> 192.168.2.6:6824/3898 conn(0x55b914c7a000 :6849 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=2 cs=2 l=0)._process_connection connect claims to be 192.168.2.6:6824/3683 not 192.168.2.6:6824/3898 - wrong node!
>> 2017-04-25 14:21:54.824814 7f2b74530700 0 -- 192.168.2.8:6849/3516 >> 192.168.2.6:6812/3552 conn(0x55b914c4e000 :6849 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=2 cs=2 l=0)._process_connection connect claims to be 192.168.2.6:6812/3467 not 192.168.2.6:6812/3552 - wrong node!
>> 2017-04-25 14:21:55.240852 7f2b75532700 0 -- 192.168.2.8:6849/3516 >> 192.168.2.6:6836/4430 conn(0x55b914413000 :6849 s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=2 cs=2 l=0)._process_connection connect claims to be 192.168.2.6:6836/3978 not 192.168.2.6:6836/4430 - wrong node!
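[Editor's note: the "wrong node" messages above come from the messenger's identify handshake. A Ceph entity address is ip:port/nonce, where the nonce is tied to the daemon's process; when a daemon restarts on a rebooted node, it reuses the same ip:port but gets a fresh nonce, so a peer still dialing a cached pre-restart address fails the check. The following is a minimal illustrative Python sketch of that check, not Ceph's actual C++ messenger code; all names are hypothetical.]

```python
# Illustrative sketch only (not Ceph source): model an entity address as
# ip:port/nonce and show why a cached pre-restart address keeps failing.
from dataclasses import dataclass

@dataclass(frozen=True)
class EntityAddr:
    ip: str
    port: int
    nonce: int  # changes when the daemon process restarts

    def __str__(self):
        return f"{self.ip}:{self.port}/{self.nonce}"

def identify(cached: EntityAddr, claimed: EntityAddr) -> bool:
    """Handshake check: the peer must match the exact cached address,
    nonce included, or it is a different incarnation of the daemon."""
    return cached == claimed

# Address cached before the node left the cluster...
old = EntityAddr("192.168.2.6", 6824, 3898)
# ...versus what the restarted daemon now claims to be.
new = EntityAddr("192.168.2.6", 6824, 3683)

if not identify(old, new):
    print(f"connect claims to be {new} not {old} - wrong node!")
```

The point of the sketch: retrying the cached address can never succeed, because the old nonce will never come back; the stale connection entry has to be replaced, not redialed.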
>>
>> strace output:
>> ==========
>> [BEFORE]
>> # timeout 10 ./strace -p 3518 -c
>> ./strace: Process 3518 attached
>> ./strace: Process 3518 detached
>> % time     seconds  usecs/call     calls    errors syscall
>> ------ ----------- ----------- --------- --------- ----------------
>>  96.91    0.282674          15     19033           epoll_wait
>>   0.95    0.002783           2      1662         7 futex
>>   0.58    0.001682           2       902       892 connect
>>   0.54    0.001581           4       381        78 read
>>   0.27    0.000774           1       912           epoll_ctl
>>   0.19    0.000563           7        79           sendmsg
>>   0.14    0.000409           1       329           close
>>   0.14    0.000398           1       329           socket
>>   0.12    0.000364           1       658           fcntl
>>   0.08    0.000247           1       329       319 shutdown
>>   0.08    0.000224           1       329           setsockopt
>> ------ ----------- ----------- --------- --------- ----------------
>> 100.00    0.291699                 24943      1296 total
>>
>> [AFTER about 1 day]
>> # timeout 10 ./strace -p 3518 -c
>> ./strace: Process 3518 attached
>> ./strace: Process 3518 detached
>> % time     seconds  usecs/call     calls    errors syscall
>> ------ ----------- ----------- --------- --------- ----------------
>>  99.91    0.403715           2    199589           epoll_wait
>>   0.05    0.000218           0     18739     18739 connect
>>   0.02    0.000075           0     32176       169 futex
>>   0.01    0.000024           0      6486           socket
>>   0.00    0.000020           0     12972           fcntl
>>   0.00    0.000018           0     18739           epoll_ctl
>>   0.00    0.000010           0      6486           setsockopt
>>   0.00    0.000006           0      6486           close
>>   0.00    0.000005           0      6486      6486 shutdown
>>   0.00    0.000000           0       304        51 read
>>   0.00    0.000000           0        51           sendmsg
>> ------ ----------- ----------- --------- --------- ----------------
>> 100.00    0.404091                308514     25445 total
>>
>> detailed strace result on a ceph-mgr:
>> ==========================
>> socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 94
>> fcntl(94, F_GETFL) = 0x2 (flags O_RDWR)
>> fcntl(94, F_SETFL, O_RDWR|O_NONBLOCK) = 0
>> setsockopt(94, SOL_TCP, TCP_NODELAY, [1], 4) = 0
>> connect(94, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.2.7")}, 16) = -1 EINPROGRESS (Operation now in progress)
>> epoll_ctl(4, EPOLL_CTL_ADD, 94, {EPOLLIN|EPOLLET, {u32=94, u64=94}}) = 0
>> connect(94, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.2.7")}, 16) = -1 EALREADY (Operation already in progress)
>> epoll_ctl(4, EPOLL_CTL_MOD, 94, {EPOLLIN|EPOLLOUT|EPOLLET, {u32=94, u64=94}}) = 0
>> epoll_wait(4, [{EPOLLIN|EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=95, u64=95}}, {EPOLLIN|EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=93, u64=93}}], 5000, 4) = 2
>> connect(95, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.2.2")}, 16) = -1 ECONNREFUSED (Connection refused)
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
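[Editor's note: the detailed trace above is the standard asynchronous-connect pattern: set O_NONBLOCK, call connect(2) and get EINPROGRESS (or EALREADY on a retry), register the fd with epoll, and read the final result via SO_ERROR once epoll reports the socket ready; a dead peer surfaces as ECONNREFUSED, which is what ceph-mgr keeps hitting for its stale addresses. A minimal Python sketch of the same sequence, using the stdlib `selectors` module (epoll-backed on Linux); the target address is hypothetical.]

```python
# Sketch of the non-blocking connect + epoll dance seen in the strace.
import errno
import selectors
import socket

def try_connect(addr, timeout=5.0):
    """Non-blocking connect; return the final socket error (0 on success)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)                       # like fcntl(..., O_NONBLOCK)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

    err = sock.connect_ex(addr)                   # connect(2) without raising
    if err in (0, errno.EINPROGRESS):
        # Wait for writability or an error -- the epoll_ctl/epoll_wait
        # calls in the trace.  DefaultSelector uses epoll on Linux.
        sel = selectors.DefaultSelector()
        sel.register(sock, selectors.EVENT_WRITE)
        sel.select(timeout)
        # SO_ERROR holds the final result of the asynchronous connect.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        sel.close()
    sock.close()
    return err

# Port 1 on localhost almost certainly has no listener, so this should end
# in ECONNREFUSED, mirroring the failed connects to the stale peers above.
print(errno.errorcode.get(try_connect(("127.0.0.1", 1)), "OK"))
```

Each failed attempt is cheap individually; the problem in the report is that the number of such doomed connect cycles grows from ~900 to ~18,700 per 10 seconds over a day, which is where the CPU goes.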