Re: repeating "wrong node" and ceph-mgr CPU usage get higher and higher without any I/O

Sage Weil <sweil@xxxxxxxxxx> · Wed, 26 Apr 2017 13:26:42 +0000 (UTC)

On Wed, 26 Apr 2017, Jerry Lee wrote:
> Hi,
> 
> I got a issue that the ceph-mgr consumes more and more CPU computing
> power even when there's no I/O in my cluster.  Some logs and strace
> output are provided as follows.  From the logs, it seems that the
> "wrong node" log starts appearing when a node leave and rejoin the
> cluster.  The question is why does ceph-mgr keeps trying the old
> connections even after it knows the new (correct) nonce already? And
> the number of trial connections is also increasing as time go on.  The
> only way to mitigate the issue is to restart the ceph-mgr process, but
> after several hours, the issue occurs again.  Is there any suggestion
> or idea that I could do to solve the issue?  Thanks!
> 
> 
> environment:
> =========
> linux-4.2.8
> ceph version v11.0.2

Are you sure that's the version you're running?  That's a random 
pre-kraken development release.

That said, even if its 11.2.0, we've fixed a ton of mgr issues in the 
run-up to luminous, so I'm not surprised if you've seen problems on 
kraken.  The easy workaround is to just stop the mgr daemon--it doesn't 
really do anything on kraken yet!

sage

> 8 nodes and 8 8TB disks on each nodes.
>    - 3 MONs
>    - 3 MGRs
>    - 3 MDSs
> 
> parts of ceph-mgr log:
> ================
> 2017-04-25 12:27:23.298417 7f2b74d31700  0 -- 192.168.2.8:6849/3516 >>
> 192.168.2.7:0/3459516139 conn(0x55b915d77800 :6849 s=STATE_OPEN pgs=2
> cs=1 l=0).fault initiating reconnect
> 2017-04-25 12:27:23.480121 7f2b74530700  0 -- 192.168.2.8:6849/3516 >>
> 192.168.2.7:0/1237847161 conn(0x55b915d79000 :6849 s=STATE_OPEN pgs=2
> cs=1 l=0).fault initiating reconnect
> 2017-04-25 12:27:23.614977 7f2b75532700  0 -- 192.168.2.8:6849/3516 >>
> 192.168.2.7:0/2251622435 conn(0x55b915d7a800 :6849 s=STATE_OPEN pgs=2
> cs=1 l=0).fault initiating reconnect
> ...
> 2017-04-25 14:21:54.632838 7f2b74530700  0 -- 192.168.2.8:6849/3516 >>
> 192.168.2.6:6824/3898 conn(0x55b914c7a000 :6849
> s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=2 cs=2
> l=0)._process_connection connect claims to be 192.168.2.6:6824/3683
> not 192.168.2.6:6824/3898 - wrong node!
> 2017-04-25 14:21:54.824814 7f2b74530700  0 -- 192.168.2.8:6849/3516 >>
> 192.168.2.6:6812/3552 conn(0x55b914c4e000 :6849
> s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=2 cs=2
> l=0)._process_connection connect claims to be 192.168.2.6:6812/3467
> not 192.168.2.6:6812/3552 - wrong node!
> 2017-04-25 14:21:55.240852 7f2b75532700  0 -- 192.168.2.8:6849/3516 >>
> 192.168.2.6:6836/4430 conn(0x55b914413000 :6849
> s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=2 cs=2
> l=0)._process_connection connect claims to be 192.168.2.6:6836/3978
> not 192.168.2.6:6836/4430 - wrong node!
> 
> strace output:
> ==========
> [BEFORE]
> # timeout 10 ./strace -p 3518 -c
> ./strace: Process 3518 attached
> ./strace: Process 3518 detached
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  96.91    0.282674          15     19033           epoll_wait
>   0.95    0.002783           2      1662         7 futex
>   0.58    0.001682           2       902       892 connect
>   0.54    0.001581           4       381        78 read
>   0.27    0.000774           1       912           epoll_ctl
>   0.19    0.000563           7        79           sendmsg
>   0.14    0.000409           1       329           close
>   0.14    0.000398           1       329           socket
>   0.12    0.000364           1       658           fcntl
>   0.08    0.000247           1       329       319 shutdown
>   0.08    0.000224           1       329           setsockopt
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.291699                 24943      1296 total
> 
> [AFTER about 1 day]
> # timeout 10 ./strace -p 3518 -c
> ./strace: Process 3518 attached
> ./strace: Process 3518 detached
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  99.91    0.403715           2    199589           epoll_wait
>   0.05    0.000218           0     18739     18739 connect
>   0.02    0.000075           0     32176       169 futex
>   0.01    0.000024           0      6486           socket
>   0.00    0.000020           0     12972           fcntl
>   0.00    0.000018           0     18739           epoll_ctl
>   0.00    0.000010           0      6486           setsockopt
>   0.00    0.000006           0      6486           close
>   0.00    0.000005           0      6486      6486 shutdown
>   0.00    0.000000           0       304        51 read
>   0.00    0.000000           0        51           sendmsg
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.404091                308514     25445 total
> 
> detailed strace result on a ceph-mgr:
> ==========================
> socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 94
> fcntl(94, F_GETFL)                      = 0x2 (flags O_RDWR)
> fcntl(94, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
> setsockopt(94, SOL_TCP, TCP_NODELAY, [1], 4) = 0
> connect(94, {sa_family=AF_INET, sin_port=htons(0),
> sin_addr=inet_addr("192.168.2.7")}, 16) = -1 EINPROGRESS (Operation
> now in progress)
> epoll_ctl(4, EPOLL_CTL_ADD, 94, {EPOLLIN|EPOLLET, {u32=94, u64=94}}) = 0
> connect(94, {sa_family=AF_INET, sin_port=htons(0),
> sin_addr=inet_addr("192.168.2.7")}, 16) = -1 EALREADY (Operation
> already in progress)
> epoll_ctl(4, EPOLL_CTL_MOD, 94, {EPOLLIN|EPOLLOUT|EPOLLET, {u32=94,
> u64=94}}) = 0
> epoll_wait(4, [{EPOLLIN|EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=95, u64=95}},
> {EPOLLIN|EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=93, u64=93}}], 5000, 4) = 2
> connect(95, {sa_family=AF_INET, sin_port=htons(0),
> sin_addr=inet_addr("192.168.2.2")}, 16) = -1 ECONNREFUSED (Connection
> refused)
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html