Unsafe TCP connection close handling

Yesterday, during an ffsb run on the ceph kernel client, both
the client and the OSD processes hit the maximum open file
descriptor limit (only one OSD was up at the time). There were
1006 sockets in the CLOSING state on the client, and 1006 in the
FIN_WAIT2 state on the OSD.
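
For anyone wanting to reproduce this kind of per-state tally, here is a
minimal sketch in Python. It assumes a Linux host and the usual layout of
/proc/net/tcp (connection state as a hex code in the fourth column); the
function names are made up for illustration, and this is not necessarily
how the counts above were obtained.

```python
from collections import Counter

# Hex state codes used by Linux in /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def count_states(lines):
    """Count sockets per TCP state from /proc/net/tcp-style lines."""
    counts = Counter()
    for line in lines[1:]:          # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue                # skip blank or truncated lines
        code = fields[3].upper()    # 'st' column
        counts[TCP_STATES.get(code, code)] += 1
    return counts

def count_proc_net_tcp(path="/proc/net/tcp"):
    """Hypothetical helper: read the live table (IPv4 only) and tally it."""
    with open(path) as f:
        return count_states(f.read().splitlines())
```

On an affected client, count_proc_net_tcp()["CLOSING"] would be the number
to watch; on the OSD host, the "FIN_WAIT2" entry.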

From the TCP state machine [1], it seems that the
sequence of events was something like this, with both sides
initially in the ESTABLISHED state:

      Kernel Client            OSD
           |                    |
           |                   /| Send FIN, go to FIN_WAIT1
Send FIN,  |\                 / |
go to      | \               /  |
FIN_WAIT1  |  \             /   |
           |   \           /    |
Recv FIN   |<--------------     |
           |     \              |
Send ACK,  |------\------------>| Recv ACK, go to FIN_WAIT2
go to      |       \            |
CLOSING    |        -----------x| FIN not read

That is, after closing its half of the connection, the OSD no
longer reads anything from the socket, and thus ignores the
FIN from the client. We have bug #1803 to track this, but we
should make sure libceph in the kernel handles simultaneous
TCP connection close correctly as well.
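
To make the simultaneous-close path in the diagram easier to follow, here
is a small Python model of the close-side transitions from the standard
TCP state machine. It is only an illustrative sketch: the event names
("close", "recv_fin", "recv_ack") and the replay() helper are invented
here, and it models neither libceph nor the OSD messenger code.

```python
# Subset of the standard TCP state machine covering connection teardown.
# Keys are (state, event); values are (next_state, segment sent or None).
TRANSITIONS = {
    ("ESTABLISHED", "close"):    ("FIN_WAIT1", "FIN"),
    ("ESTABLISHED", "recv_fin"): ("CLOSE_WAIT", "ACK"),
    ("FIN_WAIT1", "recv_ack"):   ("FIN_WAIT2", None),
    ("FIN_WAIT1", "recv_fin"):   ("CLOSING", "ACK"),   # simultaneous close
    ("FIN_WAIT2", "recv_fin"):   ("TIME_WAIT", "ACK"),
    ("CLOSING", "recv_ack"):     ("TIME_WAIT", None),
    ("CLOSE_WAIT", "close"):     ("LAST_ACK", "FIN"),
    ("LAST_ACK", "recv_ack"):    ("CLOSED", None),
}

def replay(events, state="ESTABLISHED"):
    """Return the list of states visited while applying events in order."""
    path = [state]
    for event in events:
        state, _sent = TRANSITIONS[(state, event)]
        path.append(state)
    return path

# Events as drawn in the diagram above.
client_path = replay(["close", "recv_fin"])   # sends FIN, then sees peer FIN
osd_path = replay(["close", "recv_ack"])      # sends FIN, then sees its ACK

print("client:", " -> ".join(client_path))
print("osd:   ", " -> ".join(osd_path))
```

In this model the client stays in CLOSING until its own FIN is
acknowledged, and the OSD stays in FIN_WAIT2 until it handles the client's
FIN; applying "recv_ack" on the client and "recv_fin" on the OSD moves both
to TIME_WAIT.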

[1] http://www.tcpipguide.com/free/diagrams/tcpfsm.png