Hi all,

I'm trying to learn the networking code of an ancient vanilla 2.4.18 kernel. However, although I think I understand the theory, I run into problems when I try to put it into practice.

Here is the theory.

- The egress path of a packet in a few words:

1. Coming from the upper layers, the egress path of a packet begins with the function dev_queue_xmit(skb). Higher-level protocols use this function to send a packet, in the form of a socket buffer, over a network device. The socket buffer is placed in the output queue of the network device. This is done by means of the queue's enqueue method.

2. The next step is to invoke qdisc_run(dev). This function simply calls qdisc_restart(dev) until there are no more packets in the output queue or until the network device does not accept any more packets:

	while (!netif_queue_stopped(dev) && qdisc_restart(dev)<0)
		/* NOTHING */;

There is one special case: the device has not defined methods for queue management (dev->enqueue == NULL). In this case the packet goes directly to the dev->hard_start_xmit() function, a device-dependent transmission function. This case concerns logical network devices, such as the loopback device.

3. In the normal case, qdisc_restart(dev) is responsible for getting the next packet from the queue of the network device, according to the queueing discipline (qdisc), and sending it by means of hard_start_xmit(). Before hard_start_xmit() is reached, two problems can occur:

A. The network device is currently unable to send packets, i.e. the netif_queue_stopped(dev) test is true.

B. The lock dev->xmit_lock is held. This spinlock is normally taken when the transmission of a packet is started in qdisc_restart(), so another CPU owns the lock because it is sending another packet.

In both cases the socket buffer is placed back into the queue and, finally, NET_TX_SOFTIRQ is raised in netif_schedule(), with cpu_raise_softirq(cpu, NET_TX_SOFTIRQ).

4. When do_softirq() is invoked, the function net_tx_action(), which is associated with NET_TX_SOFTIRQ, is called. net_tx_action() is responsible for invoking qdisc_run() again.

In short:

- Function path for the normal transmission process:

dev_queue_xmit(skb) -> qdisc_run(dev) -> qdisc_restart(dev) -> hard_start_xmit()

- Function path for the softirq transmission process:

dev_queue_xmit(skb) -> qdisc_run(dev) -> qdisc_restart(dev) -> netif_schedule(dev) -> cpu_raise_softirq() -> net_tx_action() -> qdisc_run(dev)

OK, that is the theory. In order to test it, I have added several control points (counters) with the following patch:

-------------- BEGIN PATCH ------------------
diff -uprN -X dontdiff linux-2.4.18/include/linux/netdevice.h linux-2.4.18_softirq/include/linux/netdevice.h
--- linux-2.4.18/include/linux/netdevice.h 2006-11-21 10:26:05.000000000 +0100
+++ linux-2.4.18_softirq/include/linux/netdevice.h 2007-01-03 15:01:23.000000000 +0100
@@ -480,7 +480,16 @@ struct softnet_data
 	struct sk_buff *completion_queue;
 } __attribute__((__aligned__(SMP_CACHE_BYTES)));
 
+extern struct softirq_counters {
+	unsigned int raise_from_netif_cont;
+	unsigned int raise_from_kfree_skb_cont;
+	unsigned int hard_start_xmit_cont;
+	unsigned int xmit_lock_grabbed_cont;
+	unsigned int tx_completion_cont;
+	unsigned int tx_output_cont;
+};
 
+extern struct softirq_counters softirq_stats;
 extern struct softnet_data softnet_data[NR_CPUS];
 
 #define HAVE_NETIF_QUEUE
@@ -490,7 +499,10 @@ static inline void __netif_schedule(stru
 	if (!test_and_set_bit(__LINK_STATE_SCHED, &dev->state)) {
 		unsigned long flags;
 		int cpu = smp_processor_id();
-
+
+#ifdef CONFIG_PROC_FS
+		softirq_stats.raise_from_netif_cont++;
+#endif
 		local_irq_save(flags);
 		dev->next_sched = softnet_data[cpu].output_queue;
 		softnet_data[cpu].output_queue = dev;
@@ -540,6 +552,9 @@ static inline void dev_kfree_skb_irq(str
 		int cpu =smp_processor_id();
 		unsigned long flags;
 
+#ifdef CONFIG_PROC_FS
+		softirq_stats.raise_from_kfree_skb_cont++;
+#endif
 		local_irq_save(flags);
 		skb->next = softnet_data[cpu].completion_queue;
 		softnet_data[cpu].completion_queue = skb;
diff -uprN -X dontdiff linux-2.4.18/net/core/dev.c linux-2.4.18_softirq/net/core/dev.c
--- linux-2.4.18/net/core/dev.c 2002-02-25 20:38:14.000000000 +0100
+++ linux-2.4.18_softirq/net/core/dev.c 2007-01-03 13:04:40.000000000 +0100
@@ -107,6 +107,7 @@ extern int plip_init(void);
 #endif
 
+struct softirq_counters softirq_stats;
 
 /* This define, if set, will randomly drop a packet when congestion
  * is more than moderate. It helps fairness in the multi-interface
@@ -1329,6 +1330,9 @@ static void net_tx_action(struct softirq
 	if (softnet_data[cpu].completion_queue) {
 		struct sk_buff *clist;
 
+#ifdef CONFIG_PROC_FS
+		softirq_stats.tx_completion_cont++;
+#endif
 		local_irq_disable();
 		clist = softnet_data[cpu].completion_queue;
 		softnet_data[cpu].completion_queue = NULL;
@@ -1346,6 +1350,9 @@ static void net_tx_action(struct softirq
 	if (softnet_data[cpu].output_queue) {
 		struct net_device *head;
 
+#ifdef CONFIG_PROC_FS
+		softirq_stats.tx_output_cont++;
+#endif
 		local_irq_disable();
 		head = softnet_data[cpu].output_queue;
 		softnet_data[cpu].output_queue = NULL;
@@ -1793,6 +1800,29 @@ static int dev_proc_stats(char *buffer,
 	return len;
 }
 
+static int softirq_proc_stats(char *buffer, char **start, off_t offset,
+			      int length, int *eof, void *data)
+{
+	int len = 0;
+
+	len += sprintf(buffer+len, "raise_from_netif_cont: %d\n"
+		       "raise_from_kfree_skb_cont: %d\n"
+		       "(qdisc_restart) hard_start_xmit_cont: %d\n"
+		       "(qdisc_restart) xmit_lock_grabbed_cont: %d\n"
+		       "tx_completion_cont: %d\n"
+		       "tx_output_cont: %d\n",
+		       softirq_stats.raise_from_netif_cont,
+		       softirq_stats.raise_from_kfree_skb_cont,
+		       softirq_stats.hard_start_xmit_cont,
+		       softirq_stats.xmit_lock_grabbed_cont,
+		       softirq_stats.tx_completion_cont,
+		       softirq_stats.tx_output_cont);
+
+	*eof = 1;
+
+	return len;
+}
+
 #endif /* CONFIG_PROC_FS */
 
 
@@ -2855,6 +2885,7 @@ int __init net_dev_init(void)
 #ifdef CONFIG_PROC_FS
 	proc_net_create("dev", 0, dev_get_info);
 	create_proc_read_entry("net/softnet_stat", 0, 0, dev_proc_stats, NULL);
+	create_proc_read_entry("net/softirq_stat", 0, 0, softirq_proc_stats, NULL);
 #ifdef WIRELESS_EXT
 	proc_net_create("wireless", 0, dev_get_wireless_info);
 #endif /* WIRELESS_EXT */
diff -uprN -X dontdiff linux-2.4.18/net/sched/sch_generic.c linux-2.4.18_softirq/net/sched/sch_generic.c
--- linux-2.4.18/net/sched/sch_generic.c 2000-08-18 19:26:25.000000000 +0200
+++ linux-2.4.18_softirq/net/sched/sch_generic.c 2007-01-03 12:59:53.000000000 +0100
@@ -97,6 +97,9 @@ int qdisc_restart(struct net_device *dev
 					spin_unlock(&dev->xmit_lock);
 
 					spin_lock(&dev->queue_lock);
+#ifdef CONFIG_PROC_FS
+					softirq_stats.hard_start_xmit_cont++;
+#endif
 					return -1;
 				}
 			}
@@ -114,6 +117,9 @@ int qdisc_restart(struct net_device *dev
 			   it by checking xmit owner and drop the
 			   packet when deadloop is detected.
 			 */
+#ifdef CONFIG_PROC_FS
+			softirq_stats.xmit_lock_grabbed_cont++;
+#endif
 			if (dev->xmit_lock_owner == smp_processor_id()) {
 				kfree_skb(skb);
 				if (net_ratelimit())
-------------- END PATCH ------------------

When I send a lot of UDP packets at a very high rate, I obtain these values:

# cat /proc/net/softirq_stat
raise_from_netif_cont: 0
raise_from_kfree_skb_cont: 0
(qdisc_restart) hard_start_xmit_cont: 4338
(qdisc_restart) xmit_lock_grabbed_cont: 0
tx_completion_cont: 1718
tx_output_cont: 160

Here is my problem: tx_completion_cont and tx_output_cont are counters inside net_tx_action(). This function is scheduled for execution only when cpu_raise_softirq() is invoked. The only two places where it is invoked for NET_TX_SOFTIRQ are:

1. In include/linux/netdevice.h -> __netif_schedule
2. In include/linux/netdevice.h -> dev_kfree_skb_irq

So if net_tx_action() has run, the counters raise_from_netif_cont and raise_from_kfree_skb_cont should never both be zero!

Can anybody tell me what the problem is? What am I doing wrong?

I'm sorry for my English :-(

--
Javi Roman.

--
Kernelnewbies: Help each other learn about the Linux kernel.
Archive: http://mail.nl.linux.org/kernelnewbies/
FAQ: http://kernelnewbies.org/faq/
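
For readers who want to experiment with the control flow described in the theory part of this post without building a kernel, here is a minimal user-space sketch in C. It is only a toy model under stated assumptions: every toy_* name is hypothetical and invented for illustration, the device and its queue are collapsed into one struct, locking is reduced to a boolean flag, and the "softirq" is an ordinary function call. It is not the kernel's implementation. It only mirrors the return convention used by the loop in qdisc_run() (negative means "a packet was sent, try again") and the requeue-then-schedule step.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_QLEN 16

/* Hypothetical stand-in for a net_device together with its output queue. */
struct toy_dev {
	int queue[TOY_QLEN];	/* ring buffer of packet ids */
	int head, count;
	bool stopped;		/* models netif_queue_stopped(dev) */
	bool xmit_locked;	/* models dev->xmit_lock held by another CPU */
	bool sched;		/* models the __LINK_STATE_SCHED bit */
	unsigned int sent;	/* packets handed to the "driver" */
	unsigned int raised;	/* times the TX "softirq" was raised */
};

/* Append a packet at the tail of the queue. */
static void toy_enqueue(struct toy_dev *d, int pkt)
{
	assert(d->count < TOY_QLEN);
	d->queue[(d->head + d->count) % TOY_QLEN] = pkt;
	d->count++;
}

/* Put a packet back at the front of the queue. */
static void toy_requeue(struct toy_dev *d, int pkt)
{
	assert(d->count < TOY_QLEN);
	d->head = (d->head + TOY_QLEN - 1) % TOY_QLEN;
	d->queue[d->head] = pkt;
	d->count++;
}

/* Test-and-set: the "softirq" is raised only once while one is pending. */
static void toy_netif_schedule(struct toy_dev *d)
{
	if (!d->sched) {
		d->sched = true;
		d->raised++;
	}
}

/* Returns < 0 if a packet was sent (caller should loop), >= 0 otherwise. */
static int toy_qdisc_restart(struct toy_dev *d)
{
	int pkt;

	if (d->count == 0)
		return 0;		/* queue is empty */

	pkt = d->queue[d->head];
	d->head = (d->head + 1) % TOY_QLEN;
	d->count--;

	if (!d->xmit_locked && !d->stopped) {
		d->sent++;		/* models a successful hard_start_xmit() */
		return -1;
	}

	toy_requeue(d, pkt);		/* problem A or B: put the packet back */
	toy_netif_schedule(d);
	return 1;
}

static void toy_qdisc_run(struct toy_dev *d)
{
	while (!d->stopped && toy_qdisc_restart(d) < 0)
		/* NOTHING */;
}

static void toy_dev_queue_xmit(struct toy_dev *d, int pkt)
{
	toy_enqueue(d, pkt);
	toy_qdisc_run(d);
}

/* Models net_tx_action(): clear the scheduled flag and retry the queue. */
static void toy_net_tx_action(struct toy_dev *d)
{
	if (d->sched) {
		d->sched = false;
		toy_qdisc_run(d);
	}
}
```

In this model, calling toy_dev_queue_xmit() on an idle device sends the packet directly and leaves raised at zero. Only a call made while xmit_locked is true requeues the packet and raises the "softirq" once, and a later toy_net_tx_action() drains the queue after the flag is cleared.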