A TCP monitoring /proc/net file

After a long delay due to my research schedule here at UCSD, I have made a
patch that creates the /proc/net/tcphealth file. This file monitors all
established TCP connections and reports some health metrics on them. See
http://heron.ucsd.edu/tcphealth.php for more information.
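
For readers who want to consume the file from user space, here is a minimal
sketch of parsing one data row. It assumes the row layout produced by the
patch below (an id, two "address:port" columns, then five counters); the
struct and function names are illustrative and are not part of the patch.

```c
#include <stdio.h>

/* One parsed row of /proc/net/tcphealth (field names are illustrative). */
struct tcphealth_row {
	int id;
	char local[32];          /* "a.b.c.d:port" */
	char remote[32];         /* "a.b.c.d:port" */
	unsigned long rtt_est;   /* RttEst column, as labeled in the header */
	unsigned long acks_sent;
	unsigned long dup_acks_sent;
	unsigned long pkts_recv;
	unsigned long dup_pkts_recv;
};

/*
 * Parse one line of the file. Returns 0 for a data row and -1 for
 * anything else (for example the explanatory header lines).
 */
int tcphealth_parse_line(const char *line, struct tcphealth_row *row)
{
	int n = sscanf(line, "%d: %31s %31s %lu %lu %lu %lu %lu",
		       &row->id, row->local, row->remote,
		       &row->rtt_est, &row->acks_sent, &row->dup_acks_sent,
		       &row->pkts_recv, &row->dup_pkts_recv);

	return n == 8 ? 0 : -1;
}
```

A caller would typically fopen() the file, read it with fgets(), and skip
every line for which the function returns -1. The file exists only on a
kernel built with this patch.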

This patch is for kernel version 2.4.3 and was made with the command 'diff
-Naur pristine-linux-2.4.3 linux-2.4.3'. Patches for other versions are
available upon request.

I believe this patch would make a useful addition to the Linux kernel. Below
is the correspondence with all of you from this past spring, with the most
recent first.

Sincerely,
Federico David Sacerdoti,
UCSD CSE department, San Diego CA

-------------
Hello!

> Would a patch for 2.4.2 be helpful?

Yes, of course. The tool is useful regardless of the circumstances.

Alexey
-------------
On Fri, Mar 23, 2001 at 09:19:14PM +0100, Federico David Sacerdoti wrote:
> The external monitoring made possible by /proc/net/tcphealth is
> interesting because the SRTT reflects the latency of one's
> network connection, and duplicate acks indicate that packets are being
> lost (or reordered, less likely) somewhere in the network.

2.4 has a special state machine to detect reordering when the connection
supports timestamps.

I guess some long term statistics (currently TCP_INFO only dumps current
state) would be useful too, but it's David's call if he wants to put in
the few cycles that'll cost (probably only in slow paths anyway).

I guess it would be better if you would put it into the existing TCP_INFO
framework, perhaps with an additional /proc frontend to TCP_INFO.
Having two ways to do a similar thing is not good.

-Andi
--------------
Date: Fri, 23 Mar 2001 12:19:14 -0800
From: Federico David Sacerdoti <fds@cs.ucsd.edu>

The external monitoring made possible by /proc/net/tcphealth is
interesting because the SRTT reflects the latency of one's
network connection, and duplicate acks indicate that packets are being
lost (or reordered, less likely) somewhere in the network.

These are things we want to know about a connection we are
trying to communicate on - its individual latency and how often packets
are being lost over it.

Would a patch for 2.4.2 be helpful?
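
As an illustration of the two metrics discussed above, here is a simplified
user-space sketch. It assumes the estimate is stored scaled by 8 (the patch
below reads it back with "srtt >> 3") and uses the conventional smoothing
gain of 1/8. The kernel's real estimator also tracks a deviation term, which
is left out here, and both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Fold one RTT sample into a smoothed estimate that is stored scaled
 * by 8 (srtt8 == 8 * SRTT). The update is SRTT += (sample - SRTT) / 8,
 * written so that it needs no division.
 */
uint32_t srtt8_update(uint32_t srtt8, uint32_t sample)
{
	if (srtt8 == 0)                 /* first sample seeds the estimate */
		return sample << 3;
	return (uint32_t)((int32_t)srtt8 +
			  ((int32_t)sample - (int32_t)(srtt8 >> 3)));
}

/* Percentage of sent ACKs that were duplicates; 0 if none were sent. */
double dup_ack_percent(uint32_t acks_sent, uint32_t dup_acks_sent)
{
	if (acks_sent == 0)
		return 0.0;
	return 100.0 * (double)dup_acks_sent / (double)acks_sent;
}
```

The reported estimate is srtt8 >> 3. A steadily rising estimate suggests
growing latency on the path, and a high duplicate percentage suggests loss
or reordering toward this host.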
--------------
On Fri, Mar 23, 2001 at 01:57:11AM +0100, David S. Miller wrote:
>
> See the TCP_INFO socket option we added to 2.4.x

Sadly TCP_INFO cannot be used for external monitoring currently
(at least not without very bad and racy hacks to allow /proc to open sockets
in /proc/pid/fd).


-Andi
--------------
Date: Thu, 22 Mar 2001 16:53:44 -0800
From: Federico David Sacerdoti <fds@cs.ucsd.edu>

For a graduate network class at UCSD I implemented some TCP performance
monitors in the Linux TCP stack (ipv4). I have added a file to the proc
filesystem (/proc/net/tcphealth) that monitors the "health" of all tcp
connections on a machine. The tcphealth file tracks smoothed
Round-Trip-Times, duplicate acks, and duplicate incoming packets for
each established tcp connection.
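
The counting rules can be sketched in user space as follows. This is a
simplified model and not the patch itself: it keeps no out-of-order queue,
the struct and function names are hypothetical, and it skips the duplicate
test for the very first ACK, which the patch does not do.

```c
#include <stdint.h>

/* Wraparound-safe "a comes before b" for 32-bit TCP sequence numbers. */
int seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Hypothetical per-connection state mirroring the counters described. */
struct health {
	uint32_t rcv_nxt;        /* next sequence number expected */
	uint32_t last_ack_sent;  /* rcv_nxt carried by our previous ACK */
	uint32_t pkts_recv;
	uint32_t dup_pkts_recv;
	uint32_t acks_sent;
	uint32_t dup_acks_sent;
};

/* Account for an arriving segment covering [seq, end_seq). */
void health_on_segment(struct health *h, uint32_t seq, uint32_t end_seq)
{
	h->pkts_recv++;
	if (seq_before(seq, h->rcv_nxt))
		h->dup_pkts_recv++;      /* carries bytes already received */

	/* Data that starts at or before rcv_nxt and extends past it
	 * advances rcv_nxt; anything else leaves it unchanged. */
	if (!seq_before(h->rcv_nxt, seq) && seq_before(h->rcv_nxt, end_seq))
		h->rcv_nxt = end_seq;
}

/* Account for an ACK we are about to send. */
void health_on_ack_sent(struct health *h)
{
	/* An ACK is a duplicate if rcv_nxt has not moved since the last one. */
	if (h->acks_sent > 0 && h->rcv_nxt == h->last_ack_sent)
		h->dup_acks_sent++;
	h->acks_sent++;
	h->last_ack_sent = h->rcv_nxt;
}
```

Feeding the model an in-order segment, then an out-of-order one, then a
retransmission of the first yields one duplicate ACK sent and one duplicate
packet received.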

I believe that there is lots of good monitoring information that can be
gleaned from this file. It works on all TCP connections
without the cooperation of the remote server.

In the code I have taken care not to disrupt the fast path in
tcp_rcv_established(), and generally have tried to step lightly. I have
patched kernel versions 2.2.14 and 2.2.16, and tested it on an ix86, a
SUN, and a PowerPC. If there is any interest, I will submit the patch to the
appropriate maintainer.

diff -Naur pristine-linux-2.4.3/Makefile linux-2.4.3/Makefile
--- pristine-linux-2.4.3/Makefile	Thu Aug  2 15:46:19 2001
+++ linux-2.4.3/Makefile	Thu Aug  2 16:06:53 2001
@@ -1,7 +1,7 @@
 VERSION = 2
 PATCHLEVEL = 4
 SUBLEVEL = 3
-EXTRAVERSION =
+EXTRAVERSION = -tcphealth
 
 KERNELRELEASE=$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)
 
diff -Naur pristine-linux-2.4.3/include/net/sock.h linux-2.4.3/include/net/sock.h
--- pristine-linux-2.4.3/include/net/sock.h	Thu Aug  2 15:47:08 2001
+++ linux-2.4.3/include/net/sock.h	Thu Aug  2 16:02:36 2001
@@ -24,6 +24,7 @@
  *		Alan Cox	:	Eliminate low level recv/recvfrom
  *		David S. Miller	:	New socket lookup architecture.
  *              Steve Whitehouse:       Default routines for sock_ops
+ *  Federico David Sacerdoti	:	Added TCP health counters.
  *
  *		This program is free software; you can redistribute it and/or
  *		modify it under the terms of the GNU General Public License
@@ -272,7 +273,8 @@
 		unsigned long timeout;	/* Currently scheduled timeout		*/
 		__u32	lrcvtime;	/* timestamp of last received data packet*/
 		__u16	last_seg_size;	/* Size of last incoming segment	*/
 		__u16	rcv_mss;	/* MSS used for delayed ACK decisions	*/ 
+		__u32	last_ack_sent;	/* sequence number of the last ack we sent. */
 	} ack;
 
 	/* Data for direct copy to user */
@@ -411,9 +413,18 @@
 	unsigned int		keepalive_time;	  /* time before keep alive takes place */
 	unsigned int		keepalive_intvl;  /* time interval between keep alive probes */
 	int			linger2;
+
+	/*
+	 *      TCP health monitoring counters.
+	 */
+	 __u32        dup_acks_sent;
+	 __u32        dup_pkts_recv;
+	 __u32        acks_sent;
+	 __u32        pkts_recv;
+
 };
 
- 	
+
 /*
  * This structure really needs to be cleaned up.
  * Most of it is for TCP, and not used by any of
diff -Naur pristine-linux-2.4.3/net/ipv4/af_inet.c linux-2.4.3/net/ipv4/af_inet.c
--- pristine-linux-2.4.3/net/ipv4/af_inet.c	Thu Aug  2 15:47:15 2001
+++ linux-2.4.3/net/ipv4/af_inet.c	Thu Aug  2 16:02:36 2001
@@ -54,6 +54,7 @@
  *					Some other random speedups.
  *		Cyrus Durgin	:	Cleaned up file for kmod hacks.
  *		Andi Kleen	:	Fix inet_stream_connect TCP race.
+ * Federico David Sacerdoti	:	Added tcphealth proc file
  *
  *		This program is free software; you can redistribute it and/or
  *		modify it under the terms of the GNU General Public License
@@ -128,6 +129,7 @@
 extern int afinet_get_info(char *, char **, off_t, int);
 extern int tcp_get_info(char *, char **, off_t, int);
 extern int udp_get_info(char *, char **, off_t, int);
+extern int tcp_health_get_info(char *, char **, off_t, int);
 extern void ip_mc_drop_socket(struct sock *sk);
 
 #ifdef CONFIG_DLCI
@@ -474,7 +476,7 @@
 	 * (ie. your servers still start up even if your ISDN link
 	 *  is temporarily down)
 	 */
-	if (sysctl_ip_nonlocal_bind == 0 && 
+	if (sysctl_ip_nonlocal_bind == 0 &&
 	    sk->protinfo.af_inet.freebind == 0 &&
 	    addr->sin_addr.s_addr != INADDR_ANY &&
 	    chk_addr_ret != RTN_LOCAL &&
@@ -1054,6 +1056,7 @@
 	proc_net_create ("sockstat", 0, afinet_get_info);
 	proc_net_create ("tcp", 0, tcp_get_info);
 	proc_net_create ("udp", 0, udp_get_info);
+	proc_net_create ("tcphealth", 0, tcp_health_get_info);
 #endif		/* CONFIG_PROC_FS */
 	return 0;
 }
diff -Naur pristine-linux-2.4.3/net/ipv4/proc.c linux-2.4.3/net/ipv4/proc.c
--- pristine-linux-2.4.3/net/ipv4/proc.c	Thu Aug  2 15:47:16 2001
+++ linux-2.4.3/net/ipv4/proc.c	Thu Aug  2 16:02:36 2001
@@ -26,6 +26,7 @@
  *	Andi Kleen		:	Add support for open_requests and 
  *					split functions for more readibility.
  *	Andi Kleen		:	Add support for /proc/net/netstat
+ * Federico David Sacerdoti	:	Added support for /proc/net/tcphealth
  *
  *		This program is free software; you can redistribute it and/or
  *		modify it under the terms of the GNU General Public License
@@ -155,7 +156,7 @@
 	if (len > length)
 		len = length;
 	if (len < 0)
-		len = 0; 
+		len = 0;
 	return len;
 }
 
@@ -212,3 +213,97 @@
 		len = 0; 
 	return len;
 }
+
+/*
+ *	Output /proc/net/tcphealth
+ */
+#define LINESZ 128
+
+int tcp_health_get_info(char *buffer, char **start, off_t offset, int length)
+{
+	int len=0, i=0, num=0;
+	off_t pos=0, begin=0;
+       char tmpbuf[LINESZ+1], srcIP[32], destIP[32];
+
+	unsigned long  dest, src, SmoothedRttEstimate,
+		AcksSent, DupAcksSent, PktsRecv, DupPktsRecv;
+	unsigned short destp, srcp;
+
+	len = sprintf(buffer,
+		"TCP Health Monitoring (established connections only)\n"
+		" -Duplicate ACKs indicate lost/reordered packets on the connection.\n"
+		" -Duplicate Packets Received show you should be using SACK (rare).\n"
+		" -RttEst estimates how long a packet takes on a round trip over the connection.\n"
+		"id   Local Address        Remote Address       RttEst(ms) AcksSent "
+		"DupAcksSent PktsRecv DupPktsRecv\n");
+	pos=len;
+
+	/* Loop through established TCP connections */
+	local_bh_disable();
+	for (i=0; i < tcp_ehash_size; i++) {
+		struct tcp_ehash_bucket *head = &tcp_ehash[i];
+		struct sock *sk;
+		struct tcp_opt *tp;
+
+		read_lock(&head->lock);
+		for (sk=head->chain; sk; sk=sk->next) {
+			if (!TCP_INET_FAMILY(sk->family))
+				continue;
+			pos+=LINESZ;
+			if (pos <= offset)
+				continue;
+
+			dest  = ntohl(sk->daddr);
+			src = ntohl(sk->rcv_saddr);
+			destp = ntohs(sk->dport);
+			srcp  = ntohs(sk->sport);
+
+			tp = &(sk->tp_pinfo.af_tcp);
+			SmoothedRttEstimate = (tp->srtt >> 3) * 1000 / HZ;
+			AcksSent = tp->acks_sent;
+			DupAcksSent = tp->dup_acks_sent;
+			PktsRecv = tp->pkts_recv;
+			DupPktsRecv = tp->dup_pkts_recv;
+
+                       sprintf(srcIP, "%lu.%lu.%lu.%lu:%u",
+                               ((src >> 24) & 0xFF), ((src >> 16) & 0xFF), ((src >> 8) & 0xFF), (src & 0xFF),
+                               srcp);
+                       sprintf(destIP, "%lu.%lu.%lu.%lu:%u",
+                               ((dest >> 24) & 0xFF), ((dest >> 16) & 0xFF), ((dest >> 8) & 0xFF), (dest & 0xFF),
+                               destp);
+
+                       sprintf(tmpbuf, "%d: %-21s %-21s "
+                               "%8lu %8lu %8lu %8lu %8lu",
+                               num,
+                               srcIP,
+                               destIP,
+                               SmoothedRttEstimate,
+                               AcksSent,
+                               DupAcksSent,
+                               PktsRecv,
+                               DupPktsRecv
+                               );
+
+			len += sprintf(buffer+len, "%-*s\n", LINESZ-1, tmpbuf);
+			if(pos >= offset+length) {
+				read_unlock(&head->lock);
+				goto out;
+			}
+			num++;
+		}
+		read_unlock(&head->lock);
+	}
+
+out:
+	local_bh_enable();
+
+	begin = len - (pos - offset);
+	*start = buffer + begin;
+	len -= begin;
+	if(len>length)
+		len = length;
+	if (len<0)
+		len = 0;
+	return len;
+}
+
diff -Naur pristine-linux-2.4.3/net/ipv4/tcp_input.c linux-2.4.3/net/ipv4/tcp_input.c
--- pristine-linux-2.4.3/net/ipv4/tcp_input.c	Thu Aug  2 15:47:16 2001
+++ linux-2.4.3/net/ipv4/tcp_input.c	Thu Aug  2 16:02:36 2001
@@ -60,6 +60,7 @@
  *		Pasi Sarolahti,
  *		Panu Kuhlberg:		Experimental audit of TCP (re)transmission
  *					engine. Lots of bugs are found.
+ * Federico David Sacerdoti	:	Added TCP health monitoring
  */
 
 #include <linux/config.h>
@@ -2489,6 +2490,8 @@
 		}
 
 		if (!after(TCP_SKB_CB(skb)->end_seq, tp->rcv_nxt)) {
+			/* Coarse retransmit inefficiency: this packet has been received twice. [tcphealth] */
+			tp->dup_pkts_recv++;
 			SOCK_DEBUG(sk, "ofo packet was already received \n");
 			__skb_unlink(skb, skb->list);
 			__kfree_skb(skb);
@@ -2584,6 +2587,10 @@
 		return;
 	}
 
+	/* A packet is a "duplicate" if it contains bytes we have already received. [tcphealth] */
+	if (before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt))
+		tp->dup_pkts_recv++;
+
 	if (!after(TCP_SKB_CB(skb)->end_seq, tp->rcv_nxt)) {
 		/* A retransmit, 2nd most common case.  Force an immediate ack. */
 		NET_INC_STATS_BH(DelayedACKLost);
@@ -3180,6 +3187,14 @@
 	 */
 
 	tp->saw_tstamp = 0;
+
+	 /*
+	  * Tcp health monitoring is interested in
+	  * total per connection packet arrivals.
+	  * There is no way to avoid putting this in the fast
+	  * path.
+	  */
+	  tp->pkts_recv++;
 
 	/*	pred_flags is 0xS?10 << 16 + snd_wnd
 	 *	if header_predition is to be made
diff -Naur pristine-linux-2.4.3/net/ipv4/tcp_output.c linux-2.4.3/net/ipv4/tcp_output.c
--- pristine-linux-2.4.3/net/ipv4/tcp_output.c	Thu Aug  2 15:47:16 2001
+++ linux-2.4.3/net/ipv4/tcp_output.c	Thu Aug  2 16:05:54 2001
@@ -33,6 +33,7 @@
  *		Andrea Arcangeli:	SYNACK carry ts_recent in tsecr.
  *		Cacophonix Gaul :	draft-minshall-nagle-01
  *		J Hadi Salim	:	ECN support
+ * Federico David Sacerdoti	:	Added TCP health monitoring
  *
  */
 
@@ -1269,9 +1270,16 @@
 		TCP_SKB_CB(buff)->flags = TCPCB_FLAG_ACK;
 		TCP_SKB_CB(buff)->sacked = 0;
 
+		/* If the rcv_nxt has not advanced since sending our last ACK, this is a duplicate. [tcphealth] */
+		if (tp->rcv_nxt == tp->ack.last_ack_sent)
+			tp->dup_acks_sent++;
+		/* Record the total number of acks sent on this connection [tcphealth]. */
+		tp->acks_sent++;
+
 		/* Send it off, this clears delayed acks for us. */
 		TCP_SKB_CB(buff)->seq = TCP_SKB_CB(buff)->end_seq = tcp_acceptable_seq(sk, tp);
 		TCP_SKB_CB(buff)->when = tcp_time_stamp;
+		tp->ack.last_ack_sent = tp->rcv_nxt;
 		tcp_transmit_skb(sk, buff);
 	}
 }
