[chunkd patch 1/6] Fix the leak of suddenly closed connections

After a period of uptime, chunkd may stop working with this:

May 20 08:51:47 azdragon2 chunkd[4034]: tcp accept: Too many open files

An examination with lsof shows that file descriptors for sockets and
object data files are leaked in neat pairs. As it turns out, the root
cause is that chunkd does not handle the case where tabled opens a
connection to read an object, then closes it before the data is
transferred. On some systems, sendfile reports no error in that case;
instead it returns the amount of data that it attempted to send before
it recognized that the socket was closed. If that happens, chunkd will
not receive a POLLOUT indication and the struct client will linger
forever with a non-empty write queue.
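
For illustration only, here is a minimal sketch of what poll() reports
for a socket whose peer has gone away. It is not chunkd code: the helper
name is hypothetical, and a local AF_UNIX socket pair stands in for the
TCP connection between tabled and chunkd, so the exact bits seen on a
real TCP socket may differ by system and by how the peer closed.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of chunkd: close one end of a local
 * socket pair and return the revents that poll() reports for the other
 * end when no events at all were requested. Returns -1 on failure.
 */
static int revents_after_peer_close(void)
{
	int sv[2];
	struct pollfd pfd;
	int rc;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	close(sv[1]);			/* the peer goes away abruptly */

	pfd.fd = sv[0];
	pfd.events = 0;			/* neither POLLIN nor POLLOUT requested */
	pfd.revents = 0;
	rc = poll(&pfd, 1, 0);		/* zero timeout: do not block */

	close(sv[0]);
	return rc < 0 ? -1 : pfd.revents;
}
```

On Linux the returned mask is expected to contain POLLHUP even though
the events field was zero, so a hang-up stays visible to the poll loop
no matter which events the server asked for.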

The fix has two parts:

 1. Permit a client in evt_recycle state to process outstanding
    writes in the same manner a client in evt_dispose does.

    Note that in our specific failure case no actual processing
    is going to occur, so the only effect of this part is to let
    the dispatch work. If we do not do this, a POLLIN may
    throw us into the evt_read_fixed stage.

 2. Once we're getting dispatched, dispose of clients whose
    connections were closed, using the unmaskable POLLHUP bit.
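
The two parts can be sketched as a toy model of the dispatch. This is
not the chunkd code: the enum, the struct and model_dispatch() are
hypothetical stand-ins, and pending_writes replaces the real write
queue. The actual change is in the diff further down.

```c
#include <poll.h>
#include <stdbool.h>

/* Simplified stand-ins for chunkd's client states; not the real ones. */
enum model_state { MODEL_READ_FIXED, MODEL_RECYCLE, MODEL_FREED };

struct model_client {
	enum model_state state;
	int pending_writes;	/* stands in for a non-empty cli->write_q */
};

/* One dispatch step for the model client. */
static void model_dispatch(struct model_client *c, short events)
{
	/* Part 2: a hang-up disposes of the client, whatever is queued. */
	if (events & POLLHUP) {
		c->state = MODEL_FREED;
		return;
	}

	/* A writable socket lets one queued write complete. */
	if ((events & POLLOUT) && c->pending_writes > 0)
		c->pending_writes--;

	/* Part 1: do not recycle while writes are still outstanding. */
	if (c->state == MODEL_RECYCLE) {
		if (c->pending_writes > 0)
			return;
		c->state = MODEL_READ_FIXED;
	}
}
```

In this model, a client in the recycle state with one pending write
stays put on a bare POLLIN, moves on to reading once a POLLOUT drains
the queue, and is freed as soon as POLLHUP arrives.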

As an aside, tabled 0.5-0.7.x resets the connection when Firefox
asks for a file only if it was modified after a certain date. In that
case, tabled wants to know when the file was modified, so it reads the
header from chunkd. If it turns out that the client is not interested
in the data, tabled simply closes the connection without reading
whatever data has arrived. This may change in the future, but the
bug in chunkd should be fixed anyway, for general robustness.

Signed-off-by: Pete Zaitcev <zaitcev@xxxxxxxxxx>

---
 server/server.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

commit a217892610de6c38453b2f63605880de43ec54af
Author: Master <zaitcev@xxxxxxxxxxxxxxxxxx>
Date:   Thu May 20 21:19:48 2010 -0600

    Fix the leak of suddenly closed connections.

diff --git a/server/server.c b/server/server.c
index a2dc656..07d0375 100644
--- a/server/server.c
+++ b/server/server.c
@@ -399,6 +399,13 @@ static bool cli_evt_dispose(struct client *cli, unsigned int events)
 
 static bool cli_evt_recycle(struct client *cli, unsigned int events)
 {
+
+	/* if write queue is not empty, we should continue to get
+	 * poll callbacks here until it is
+	 */
+	if (!list_empty(&cli->write_q))
+		return false;
+
 	cli->req_ptr = &cli->creq;
 	cli->req_used = 0;
 	cli->state = evt_read_fixed;
@@ -1303,6 +1310,12 @@ static bool tcp_cli_event(int fd, short events, void *userdata)
 	struct client *cli = userdata;
 	bool loop = false, disposing = false;
 
+	if (events & POLLHUP) {
+		cli->state = evt_dispose;
+		cli_free(cli);
+		return true;
+	}
+
 	if (events & POLLOUT)
 		tcp_cli_wr_event(fd, events & ~POLLIN, userdata);
 