Hi,

http://tracker.ceph.com/issues/17411 has been created.

Thanks,
Jeegn

2016-09-27 10:00 GMT+08:00 huang jun <hjwsm1989@xxxxxxxxx>:
> hi, you can create an issue on the website:
> http://tracker.ceph.com/projects/ceph/issues
> We have noticed some osd_tp timeout suicide situations; they may have
> the same root cause as the one in your post.
>
> 2016-09-27 9:47 GMT+08:00 Jeegn Chen <jeegnchen@xxxxxxxxx>:
>> Hi all,
>>
>> I noticed a deadlock phenomenon in a Hammer OSD (strictly speaking,
>> starvation related to locks may be more precise, since there is no
>> lock cycle).
>> Although I found the threads blocking each other, I have not figured
>> out how they came to this state. Any hint will be appreciated.
>>
>> As you can see in the gdb inspection below:
>> * Thread 11 has taken osd_lock but is waiting for heartbeat_lock.
>> * Thread 37 has taken heartbeat_lock but is waiting for a
>>   PipeConnection's lock.
>> * Thread 286 has taken the PipeConnection's lock needed by thread 37,
>>   but it never returns from Pipe::tcp_read_nonblocking() to
>>   Pipe::read_message(); as a result, the PipeConnection's lock is
>>   never released by Pipe::read_message().
>> * Since osd_lock and heartbeat_lock are both held, most of the OSD's
>>   threads cannot make progress, so it looks like a deadlock.
>>
>> A more detailed check on thread 286 shows that recv() always returns
>> EAGAIN when invoked from Pipe::do_recv(). Thus the Pipe::do_recv()
>> logic retries forever, which is why Pipe::tcp_read_nonblocking()
>> never finishes.
>>
>> I examined the Pipe object with gdb and found it is a connection from
>> 192.168.30.41:63805 to 192.168.30.105:6805. So I used "lsof" and
>> "netstat" to check both servers (see the results following the gdb
>> output). It seems that 192.168.30.41:63805->192.168.30.105:6805 is
>> still ESTABLISHED on 192.168.30.41, but the TCP connection does not
>> exist from 192.168.30.105's perspective.
>> My understanding is that even if a connection is half-closed on one
>> side, recv() on the other side should return 0 and give no error. So
>> why does recv() return EAGAIN constantly?
>>
>> Any idea how this happened, or any explanation? Some bug in the
>> kernel TCP stack (2.6.32)?
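>>
>> To illustrate what I mean by "retries forever", here is a minimal
>> standalone sketch of that retry pattern (my own simplified code, not
>> the actual Pipe::do_recv() source). Note that flags=64 in the thread
>> 286 backtrace below is MSG_DONTWAIT (0x40 on Linux), i.e. a
>> non-blocking recv():
>>
>>   // Simplified model of a non-blocking recv() retry loop.
>>   #include <sys/types.h>
>>   #include <sys/socket.h>
>>   #include <cerrno>
>>
>>   ssize_t recv_with_retry(int fd, char *buf, size_t len) {
>>     for (;;) {
>>       ssize_t got = ::recv(fd, buf, len, MSG_DONTWAIT);
>>       if (got > 0)
>>         return got;      // data arrived
>>       if (got == 0)
>>         return 0;        // the peer's FIN arrived: orderly shutdown
>>       if (errno == EAGAIN || errno == EWOULDBLOCK)
>>         continue;        // no data yet -- the branch thread 286
>>                          // never escapes from
>>       return -errno;     // real error, e.g. ECONNRESET
>>     }
>>   }
>>
>> As the comments note, recv() returns 0 only when a FIN is actually
>> received, so if neither a FIN nor an RST is ever delivered, EAGAIN
>> simply means "no data yet".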
>>
>> (gdb) thread 11
>> [Switching to thread 11 (Thread 0x7fcebcd2f700 (LWP 29061))]#0
>> 0x00000032a260e334 in __lll_lock_wait () from /lib64/libpthread.so.0
>> (gdb) bt
>> #0  0x00000032a260e334 in __lll_lock_wait () from /lib64/libpthread.so.0
>> #1  0x00000032a26095d8 in _L_lock_854 () from /lib64/libpthread.so.0
>> #2  0x00000032a26094a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
>> #3  0x0000000000c0ba3b in Mutex::Lock (this=0x5bb0ce8,
>>     no_lockdep=Unhandled dwarf expression opcode 0xf3) at common/Mutex.cc:91
>> #4  0x0000000000795c29 in Locker (this=0x5bb0000) at common/Mutex.h:115
>> #5  OSD::maybe_update_heartbeat_peers (this=0x5bb0000) at osd/OSD.cc:3432
>> #6  0x00000000007bad13 in OSD::tick (this=0x5bb0000) at osd/OSD.cc:3863
>> #7  0x00000000007cb299 in Context::complete (this=0x5988a60,
>>     r=Unhandled dwarf expression opcode 0xf3) at include/Context.h:65
>> #8  0x0000000000c21bbf in SafeTimer::timer_thread (this=0x5bb0070)
>>     at common/Timer.cc:105
>> #9  0x0000000000c2317d in SafeTimerThread::entry (this=Unhandled dwarf
>>     expression opcode 0xf3) at common/Timer.cc:38
>> #10 0x00000032a2607aa1 in start_thread () from /lib64/libpthread.so.0
>> #11 0x00000032a22e893d in clone () from /lib64/libc.so.6
>> (gdb) frame 8
>> #8  0x0000000000c21bbf in SafeTimer::timer_thread (this=0x5bb0070)
>>     at common/Timer.cc:105
>> 105         callback->complete(0);
>> (gdb) p &lock
>> $1 = (Mutex *) 0x5bb0018
>> (gdb) frame 5
>> #5  OSD::maybe_update_heartbeat_peers (this=0x5bb0000) at osd/OSD.cc:3432
>> 3432        Mutex::Locker l(heartbeat_lock);
>> (gdb) p &heartbeat_lock
>> $2 = (Mutex *) 0x5bb0ce8
>>
>> (gdb) thread 37
>> [Switching to thread 37 (Thread 0x7fceaa4e4700 (LWP 29763))]#0
>> 0x00000032a260e334 in __lll_lock_wait () from /lib64/libpthread.so.0
>> (gdb) bt
>> #0  0x00000032a260e334 in __lll_lock_wait () from /lib64/libpthread.so.0
>> #1  0x00000032a26095d8 in _L_lock_854 () from /lib64/libpthread.so.0
>> #2  0x00000032a26094a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
>> #3  0x0000000000c0ba3b in Mutex::Lock (this=0x49c38b18,
>>     no_lockdep=Unhandled dwarf expression opcode 0xf3) at common/Mutex.cc:91
>> #4  0x0000000000ddeecb in Locker (this=0x49c38b00, p=0x7fceaa4e2c60)
>>     at common/Mutex.h:115
>> #5  PipeConnection::try_get_pipe (this=0x49c38b00, p=0x7fceaa4e2c60)
>>     at msg/simple/PipeConnection.cc:38
>> #6  0x0000000000db52dc in SimpleMessenger::submit_message
>>     (this=0x5aa8e00, m=0x2c6d0b40, con=0x49c38b00, dest_addr=...,
>>     dest_type=4, already_locked=false) at msg/simple/SimpleMessenger.cc:443
>> #7  0x0000000000db5da7 in SimpleMessenger::_send_message
>>     (this=0x5aa8e00, m=0x2c6d0b40, con=0x49c38b00)
>>     at msg/simple/SimpleMessenger.cc:136
>> #8  0x00000000007a27e0 in OSDService::send_incremental_map
>>     (this=0x5bb16c8, since=58626, con=0x49c38b00, osdmap=Unhandled dwarf
>>     expression opcode 0xf3) at osd/OSD.cc:1087
>> #9  0x00000000007a2a89 in OSDService::share_map_peer (this=0x5bb16c8,
>>     peer=28, con=0x49c38b00, map=std::tr1::shared_ptr (count 562)
>>     0x5b367500) at osd/OSD.cc:887
>> #10 0x00000000007a8324 in OSD::handle_osd_ping (this=0x5bb0000,
>>     m=0x6861f600) at osd/OSD.cc:3599
>> #11 0x00000000007a8b2b in OSD::heartbeat_dispatch (this=0x5bb0000,
>>     m=0x6861f600) at osd/OSD.cc:5254
>> #12 0x0000000000de07ab in ms_deliver_dispatch (this=0x5aa9dc8)
>>     at msg/Messenger.h:567
>> #13 DispatchQueue::entry (this=0x5aa9dc8) at msg/simple/DispatchQueue.cc:185
>> #14 0x0000000000db86dd in DispatchQueue::DispatchThread::entry
>>     (this=Unhandled dwarf expression opcode 0xf3)
>>     at msg/simple/DispatchQueue.h:103
>> #15 0x00000032a2607aa1 in start_thread () from /lib64/libpthread.so.0
>> #16 0x00000032a22e893d in clone () from /lib64/libc.so.6
>> (gdb) frame 10
>> #10 0x00000000007a8324 in OSD::handle_osd_ping (this=0x5bb0000,
>>     m=0x6861f600) at osd/OSD.cc:3599
>> 3599        service.share_map_peer(from, con.get());
>> (gdb) p &heartbeat_lock
>> $6 = (Mutex *) 0x5bb0ce8
>> (gdb) frame 5
>> #5  PipeConnection::try_get_pipe (this=0x49c38b00, p=0x7fceaa4e2c60)
>>     at msg/simple/PipeConnection.cc:38
>> 38          Mutex::Locker l(lock);
>> (gdb) p &lock
>> $8 = (Mutex *) 0x49c38b18
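>>
>> So the wait chain is: thread 11 (holds osd_lock, waits on
>> heartbeat_lock at 0x5bb0ce8) -> thread 37 (holds heartbeat_lock,
>> waits on the connection lock at 0x49c38b18) -> thread 286 (holds the
>> connection lock, stuck in recv()). To show how this wedges everything
>> without any lock cycle, here is a toy standalone program (my own
>> code, not Ceph; the sleeps only stage the interleaving, and the 24h
>> sleep stands in for the stuck recv()):
>>
>>   #include <chrono>
>>   #include <mutex>
>>   #include <thread>
>>
>>   std::mutex osd_lock, heartbeat_lock, conn_lock;
>>
>>   int main() {
>>     std::thread reader([] {      // plays the role of thread 286
>>       std::lock_guard<std::mutex> c(conn_lock);
>>       std::this_thread::sleep_for(std::chrono::hours(24));
>>     });
>>     std::thread dispatch([] {    // plays the role of thread 37
>>       std::this_thread::sleep_for(std::chrono::milliseconds(100));
>>       std::lock_guard<std::mutex> h(heartbeat_lock);
>>       std::lock_guard<std::mutex> c(conn_lock);       // blocks on reader
>>     });
>>     std::thread tick([] {        // plays the role of thread 11
>>       std::this_thread::sleep_for(std::chrono::milliseconds(200));
>>       std::lock_guard<std::mutex> o(osd_lock);
>>       std::lock_guard<std::mutex> h(heartbeat_lock);  // blocks on dispatch
>>     });
>>     reader.join(); dispatch.join(); tick.join();      // wedged for ~24h
>>     return 0;
>>   }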
>>
>> (gdb) thread 286
>> [Switching to thread 286 (Thread 0x7fce7a9d9700 (LWP 15873))]#0
>> 0x00000032a260ec2c in recv () from /lib64/libpthread.so.0
>> (gdb) bt
>> #0  0x00000032a260ec2c in recv () from /lib64/libpthread.so.0
>> #1  0x0000000000dc8724 in recv (this=0x52fec800,
>>     buf=0xe27c000 "\235\302óH[\002\220R1\035\251\334,-(\353L\rJ\247\232i\210._j\001\375\357\351U\211⤸}\322}8\250\\\374\325W\020f\237\037\314\300Ty\251-\307%\277\ni\205\213\031\245\024\314҂j\223\025\207\212Z`4\340E5!\016\006\227\064\320h\334*\224\204\061\037r\017ʍ\325\f/\202G\257\065&j%\"\222\027\064\204\322n\244\315K\220\354.i\255\206\004z\320M&j\\\200\256x␚|\343\r\237_\347Q\023I\261\216\315Y\202M\361\217Q\301\252\224\370$\330\370\354\334T\261\226\263E%!\245pC\262)\r%\031\244\061\262(u*{\325B\n\222\017QW*+\230\262\067\216\335~\224"...,
>>     len=4096, flags=64) at /usr/include/bits/socket2.h:45
>> #2  Pipe::do_recv (this=0x52fec800,
>>     buf=0xe27c000 "\235\302óH[\002\220R1\035\251\334,-(\353L\rJ\247\232i\210._j\001\375\357\351U\211⤸}\322}8\250\\\374\325W\020f\237\037\314\300Ty\251-\307%\277\ni\205\213\031\245\024\314҂j\223\025\207\212Z`4\340E5!\016\006\227\064\320h\334*\224\204\061\037r\017ʍ\325\f/\202G\257\065&j%\"\222\027\064\204\322n\244\315K\220\354.i\255\206\004z\320M&j\\\200\256x␚|\343\r\237_\347Q\023I\261\216\315Y\202M\361\217Q\301\252\224\370$\330\370\354\334T\261\226\263E%!\245pC\262)\r%\031\244\061\262(u*{\325B\n\222\017QW*+\230\262\067\216\335~\224"...,
>>     len=4096, flags=64) at msg/simple/Pipe.cc:2430
>> #3  0x0000000000dc8990 in Pipe::buffered_recv (this=0x52fec800,
>>     buf=0x4d29e3e1 "", len=3823, flags=Unhandled dwarf expression
>>     opcode 0xf3) at msg/simple/Pipe.cc:2476
>> #4  0x0000000000dc8a90 in Pipe::tcp_read_nonblocking (this=0x52fec800,
>>     buf=Unhandled dwarf expression opcode 0xf3) at msg/simple/Pipe.cc:2494
>> #5  0x0000000000dc9b36 in Pipe::read_message (this=0x52fec800,
>>     pm=0x7fce7a9d8b98, auth_handler=0x5d22ef60) at msg/simple/Pipe.cc:2034
>> #6  0x0000000000ddbcc7 in Pipe::reader (this=0x52fec800)
>>     at msg/simple/Pipe.cc:1581
>> #7  0x0000000000dded6d in Pipe::Reader::entry (this=Unhandled dwarf
>>     expression opcode 0xf3) at msg/simple/Pipe.h:50
>> #8  0x00000032a2607aa1 in start_thread () from /lib64/libpthread.so.0
>> #9  0x00000032a22e893d in clone () from /lib64/libc.so.6
>> (gdb) frame 5
>> #5  0x0000000000dc9b36 in Pipe::read_message (this=0x52fec800,
>>     pm=0x7fce7a9d8b98, auth_handler=0x5d22ef60) at msg/simple/Pipe.cc:2034
>> 2034        int got = tcp_read_nonblocking(bp.c_str(), read);
>> (gdb) p connection_state
>> $3 = {px = 0x49c38b00}
>> (gdb) p &((PipeConnection*)0x49c38b00)->lock
>> $5 = (Mutex *) 0x49c38b18
>> (gdb) p this->port
>> $11 = 63805
>> (gdb) p this->peer_type
>> $12 = 4
>> (gdb) printf "%d.%d.%d.%d\n",
>> this->peer_addr.addr4.sin_addr.s_addr&0xFF,
>> this->peer_addr.addr4.sin_addr.s_addr>>8&0xFF,
>> this->peer_addr.addr4.sin_addr.s_addr>>16&0xFF,
>> this->peer_addr.addr4.sin_addr.s_addr>>24&0xFF
>> 192.168.30.105
>> (gdb) printf "%d\n", (this->peer_addr.addr4.sin_port<<8&0xFF00) |
>> (this->peer_addr.addr4.sin_port>>8&0x00FF)
>> 6805
>>
>> [root@192.168.30.41 ~]# lsof -i :63805
>> COMMAND    PID USER  FD   TYPE    DEVICE SIZE/OFF NODE NAME
>> ceph-osd 29007 root 647u  IPv4 150861496      0t0  TCP
>> 192.168.30.41:63805->192.168.30.105:6805 (ESTABLISHED)
>>
>> [root@192.168.30.105 ~]# netstat -nap | grep 63805
>> [root@192.168.30.105 ~]#
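>>
>> One explanation I can think of (an assumption on my part, not
>> confirmed): this looks like a classic half-open connection. recv()
>> returns 0 only when the peer's FIN actually arrives; if
>> 192.168.30.105 lost its side of the connection without a FIN or RST
>> ever reaching 192.168.30.41 (peer crash/reboot, or the segment was
>> dropped), the local socket stays ESTABLISHED and a non-blocking
>> recv() keeps returning EAGAIN, because to the local stack there is
>> simply no data yet. Without TCP keepalive such a connection can
>> persist indefinitely. A sketch of enabling keepalive so a dead peer
>> eventually surfaces as an error rather than endless EAGAIN (the
>> timeout values are illustrative, not Ceph defaults):
>>
>>   #include <sys/socket.h>
>>   #include <netinet/in.h>
>>   #include <netinet/tcp.h>
>>
>>   // After roughly idle + cnt*intvl seconds of silence the kernel
>>   // declares the peer dead and recv() fails (ETIMEDOUT) instead of
>>   // returning EAGAIN forever.
>>   int enable_keepalive(int fd) {
>>     int on = 1, idle = 60, intvl = 10, cnt = 6;
>>     if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
>>       return -1;
>>     setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));
>>     setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl));
>>     setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt));
>>     return 0;
>>   }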
>
> --
> Thank you!
> HuangJun