Well - I think it might be that some of my servlets weren't closing their database connections properly. I do have some new evidence though: I did an strace of the tomcat processes, and I noticed something that might be odd, but I'm not really qualified to say. Every time a socket sends a request to PostgreSQL it gets some kind of reply. This is true in all cases EXCEPT when the application crashes. Here is the segment of the strace right before it throws a wobbly:

[pid 4565] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 156
[pid 4565] bind(156, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 4565] getsockname(156, {sa_family=AF_INET, sin_port=htons(56550), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
[pid 4565] connect(156, {sa_family=AF_INET, sin_port=htons(5432), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
[pid 4565] setsockopt(156, SOL_TCP, TCP_NODELAY, [1], 4) = 0
[pid 4565] send(156, "\0\0\0W\0\3\0\0user\0postgres\0database\0t"..., 87, 0) = 87
[pid 4565] recv(156, "R\0\0\0\10\0\0\0\0S\0\0\0\34client_encoding\0UN"..., 8192, 0) = 279
[pid 4565] gettimeofday({1204948966, 386187}, NULL) = 0
[pid 4565] send(156, "P\0\0\1\35\0\r\n \t\tselect"..., 334, 0) = 334
[pid 4565] recv(156, "", 8192, 0) = 0
[pid 4565] send(156, "X\0\0\0\4", 5, 0) = 5
[pid 4565] dup2(11, 156) = 156
[pid 4565] close(156) = 0

Notice that the recv(156, ...) right after sending the query comes back with 0 bytes, which seems odd given that we had just sent a query to the database.

I'm really in a bind with this one. It started happening a couple of days ago, and all our admin applications are basically down :(. People can't even log the bugs that this is generating, because the bug tracker (trac) runs on this same PostgreSQL and is throwing errors too.
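For what it's worth, as far as I understand it a recv() that returns 0 bytes on a TCP socket usually means the other end closed the connection, not that it sent an empty reply. Here is a minimal Python sketch of that behaviour, assuming a throwaway local server that reads one request and hangs up without answering. The function names and the fake query bytes are made up for illustration; none of this is PostgreSQL or JDBC code.

```python
import socket
import threading


def serve_once_and_hang_up(listener):
    """Accept one connection, read the request, then close without replying."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(8192)  # read the fake "query" and discard it


def recv_after_peer_close():
    """Return what recv() yields when the peer closes instead of answering."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(target=serve_once_and_hang_up, args=(listener,))
    server.start()

    client = socket.create_connection(("127.0.0.1", port))
    try:
        client.sendall(b"P fake query")  # hypothetical payload, not a real message
        data = client.recv(8192)         # blocks until the server closes
    finally:
        client.close()
        server.join()
        listener.close()
    return data


if __name__ == "__main__":
    print(repr(recv_after_peer_close()))
```

If that reading is right, the empty recv in the trace would point at the backend going away after receiving the query, which is something the PostgreSQL server log should be able to confirm or rule out.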
I also caught something else that seemed weird in another trace:

[pid 3553] send(28, "P\0\0\0H\0delete from result_cache w"..., 108, 0) = 108
[pid 3553] recv(28, "N\0\0\1\202SWARNING\0C57P02\0Mterminatin"..., 8192, 0) = 387
[pid 3553] gettimeofday({1204946902, 977641}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977682}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977766}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977902}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 977973}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 978012}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 978053}, NULL) = 0
[pid 3553] gettimeofday({1204946902, 978091}, NULL) = 0
[pid 3553] recv(28, "", 8192, 0) = 0
[pid 3553] send(28, "X\0\0\0\4", 5, 0) = -1 EPIPE (Broken pipe)
[pid 3553] --- SIGPIPE (Broken pipe) @ 0 (0) ---
[pid 3553] rt_sigreturn(0x9) = -1 EPIPE (Broken pipe)

I couldn't reproduce this though. It just randomly throws a SIGPIPE after the query. The other weird thing is that this process also throws a SIGSEGV at another point. I wasn't expecting tomcat to crash, so alas I didn't capture a core file. I guess I should set up the system default so one gets written next time.

Alex

On Fri, Mar 7, 2008 at 2:28 PM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
> On Fri, Mar 7, 2008 at 11:17 AM, Alex Turner <armtuk@xxxxxxxxx> wrote:
> > I didn't. And after the reboot, I still see 8 new sockets stuck in
> > CLOSE_WAIT - I'm wondering if this is a hardware/kernel problem...
>
> Having sockets in CLOSE_WAIT is actually pretty normal
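A note on the second trace: the reply to the delete looks like a notice message from the server, which in the v3 wire protocol is a type byte, a 4-byte big-endian length that counts itself but not the type byte, and then fields made of a one-letter code plus a NUL-terminated string. The length bytes \0\0\1\202 come to 386, and 386 + 1 matches the 387 that recv returned. If I am reading the error codes appendix correctly, SQLSTATE 57P02 is crash_shutdown. Below is a rough Python sketch for splitting such a message into fields. parse_notice and build_notice are hypothetical helper names, and the sample message is a placeholder, since strace truncated the real text.

```python
import struct


def parse_notice(message: bytes) -> dict:
    """Split a v3 NoticeResponse/ErrorResponse into {field code: value}."""
    if message[0:1] not in (b"N", b"E"):
        raise ValueError("not a NoticeResponse or ErrorResponse")
    # The length counts its own 4 bytes but not the leading type byte.
    (length,) = struct.unpack("!I", message[1:5])
    body = message[5:1 + length]
    fields = {}
    for part in body.split(b"\0"):
        if part:  # skip the empty pieces left by the terminators
            fields[chr(part[0])] = part[1:].decode("utf-8", "replace")
    return fields


def build_notice(fields: dict) -> bytes:
    """Build a sample notice for testing (hypothetical helper)."""
    body = b"".join(
        code.encode() + value.encode() + b"\0" for code, value in fields.items()
    )
    body += b"\0"  # a zero byte terminates the field list
    return b"N" + struct.pack("!I", len(body) + 4) + body


if __name__ == "__main__":
    # Length header copied from the strace line, written with octal escapes.
    total = struct.unpack("!I", b"\0\0\1\202")[0] + 1
    print(total)
    # Placeholder message text; the real one is cut off in the trace.
    sample = build_notice({"S": "WARNING", "C": "57P02", "M": "placeholder text"})
    print(parse_notice(sample))
```

Running strace with a larger -s value should capture the whole message, so the full text could be decoded this way instead of guessing from the first few bytes.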