Hi list,

I'm trying to solve a problem with 3-second timeouts on TCP connections. We are seeing these timeouts in both production and test environments. They mainly occur when our webservers connect to our MySQL backend (about 450 webservers connecting to roughly 400 database servers, serving 13 million pageviews per hour during busy hours).

After a lot of debugging I can reproduce the problem with the C program below, which only measures the time needed to do a connect().

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/time.h>      /* gettimeofday() */
#include <netinet/in.h>
#include <sys/socket.h>

#define PORT 3306

int main(int argc, char *argv[])
{
    int sockfd, t;
    struct hostent *he;
    struct sockaddr_in conn_addr;
    struct timeval tb, te;

    if (argc < 2) {
        fprintf(stderr, "usage: %s host\n", argv[0]);
        exit(1);
    }

    /* gethostbyname() reports errors via h_errno, so use herror() */
    if ((he = gethostbyname(argv[1])) == NULL) {
        herror("gethostbyname");
        exit(1);
    }

    for (t = 0; t < 1000; t++) {
        if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
            perror("socket");
            exit(1);
        }

        conn_addr.sin_family = AF_INET;
        conn_addr.sin_port = htons(PORT);
        conn_addr.sin_addr = *((struct in_addr *)he->h_addr);
        memset(&(conn_addr.sin_zero), '\0', 8);

        /* time only the connect() call itself */
        gettimeofday(&tb, NULL);
        if (connect(sockfd, (struct sockaddr *)&conn_addr,
                    sizeof(conn_addr)) == -1) {
            perror("connect");
            exit(1);
        }
        gettimeofday(&te, NULL);

        shutdown(sockfd, SHUT_RDWR);
        close(sockfd);

        double micro_sec_begin = (double)tb.tv_sec * 1000000 + tb.tv_usec;
        double micro_sec_end = (double)te.tv_sec * 1000000 + te.tv_usec;
        double elapsed = (micro_sec_end - micro_sec_begin) / 1000;

        printf("time %d: %.2f ms\n\n", t, elapsed);
    }

    return 0;
}

Sample output:

time 938: 0.02 ms

time 939: 3003.88 ms

time 940: 0.04 ms

I've run this application in production (on busy and idle servers) and against different ports, and still saw some 3-second timeouts, about 3 to 5 per 1000 connections. Then I tested it on my own laptop via the lo interface. Same result: about 3 to 5 timeouts per 1000 tests.
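
The "3 to 5 per 1000" figure can be tallied mechanically rather than read off the output by eye. What follows is only a minimal sketch: the helper names elapsed_ms and count_slow are hypothetical, and the cutoff passed as threshold_ms is an arbitrary choice for illustration, not something taken from the measurements.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Elapsed time from *begin to *end in milliseconds.
 * Seconds and microseconds are subtracted separately, so the
 * intermediate values stay small. */
double elapsed_ms(const struct timeval *begin, const struct timeval *end)
{
    assert(begin != NULL && end != NULL);
    return (double)(end->tv_sec - begin->tv_sec) * 1000.0 +
           (double)(end->tv_usec - begin->tv_usec) / 1000.0;
}

/* Number of samples in ms[0..n-1] that are at or above threshold_ms. */
size_t count_slow(const double *ms, size_t n, double threshold_ms)
{
    size_t slow = 0;
    size_t i;

    assert(ms != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (ms[i] >= threshold_ms)
            slow++;
    }
    return slow;
}
```

With the three samples from the output above (0.02, 3003.88 and 0.04 ms) and a cutoff of 1000 ms, count_slow() reports one slow connect.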
Then I noticed the large number of sockets in TIME_WAIT state and wondered if there could be a relation. I also tried this on an OpenBSD machine, with the same timeouts, so I figure this is a TCP/IP-related thing.

Could anybody help me out with this problem? Is there a /proc/sys/net/ipv4/ setting to tune this 3-second timeout?

We are running 2.6.18, 2.6.22 and 2.6.24 kernels, all on x86_64 hardware, using Linux Virtual Server load balancing for some parts of our server park.

The sysctl settings we use as standard:

net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.netdev_max_backlog = 50000
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_max_tw_buckets = 2000000
net.ipv4.tcp_max_syn_backlog = 30000
net.ipv4.tcp_no_metrics_save = 1
kernel.shmmax = 512000000

I've tried tuning via

tcp_tw_reuse
tcp_tw_recycle
tcp_max_orphans
tcp_max_tw_buckets
ip_local_port_range

but all without any result.

Any help is highly appreciated.

Kind regards,

Marlon de Boer
System engineer for www.hyves.nl
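
For anyone wanting to compare TIME_WAIT counts on their own machines, here is a minimal sketch. It assumes the Linux /proc/net/tcp layout, where each line after the header starts with "sl: local_address rem_address st" and the st column is the socket state in hex (06 is TIME_WAIT). The names count_tcp_state, count_time_wait and STATE_TIME_WAIT are hypothetical, and only IPv4 sockets are covered (/proc/net/tcp6 holds the IPv6 ones).

```c
#include <assert.h>
#include <stdio.h>

/* Hex state code for TIME_WAIT in the "st" column of /proc/net/tcp. */
#define STATE_TIME_WAIT 0x06u

/* Count the entries in a given TCP state in a stream laid out like
 * /proc/net/tcp. The first line is treated as the column header. */
long count_tcp_state(FILE *fp, unsigned int wanted)
{
    char line[512];
    long count = 0;

    assert(fp != NULL);

    /* Skip the header line; an empty stream has no sockets. */
    if (fgets(line, sizeof line, fp) == NULL)
        return 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        unsigned int state;

        /* Skip "sl:", the local and remote address, then read st. */
        if (sscanf(line, " %*d: %*s %*s %x", &state) == 1 &&
            state == wanted)
            count++;
    }
    return count;
}

/* Convenience wrapper: TIME_WAIT count for IPv4, or -1 if the
 * proc file cannot be opened. */
long count_time_wait(void)
{
    FILE *fp = fopen("/proc/net/tcp", "r");
    long n;

    if (fp == NULL)
        return -1;
    n = count_tcp_state(fp, STATE_TIME_WAIT);
    fclose(fp);
    return n;
}
```

Calling count_time_wait() before and after a run of the test program would show how many TIME_WAIT entries the 1000 connects leave behind, provided the machine running it is the side that closes the connections first.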