Re: High sys+irq usage on CPU0 with interrupts rerouted

Steve Fink wrote:
> I have a quad-core Xeon with 4 gigabit NICs. I'm sending gobs of data
> out three of them (about 93% of capacity). On one machine, identical
> to the other, I have a problem where CPU0 is 0% idle while the other 3
> cores are over 60% idle. My application mostly avoids CPU0, and there
> is very little user time being spent on CPU0. It is almost all irq
> (about 25%) and sys (about 50%).
> 
> On the other machine, there's a little bit more sys and irq time on
> CPU0, but everything's pretty well balanced.
> 
> If I kill irqbalance and manually set smp_affinity to avoid CPU0,
> nothing much changes. /proc/interrupts shows a fair number of
> interrupts going to CPU0 before I make the change, and none after
> (other than "LOC" interrupts), but the CPU0 load doesn't change much
> at all.
> 
> There are hardly any packets coming into those three NICs; almost
> everything is a set of (several hundred) MPEG streams going out.
> 
> I guess my main question is: if /proc/interrupts doesn't show any
> interrupts going to CPU0, then why is it spending any time on
> interrupt handling?
> 
> A related question: does NAPI apply to sending packets as well as
> receiving them? How can I tell whether (1) my drivers support NAPI and
> (2) it is active?
> 
> I've tried playing with systemtap a bit to try to figure out what that
> CPU0 time is going to, but I don't really know how to use it well
> enough.
> 
> Here's a snapshot of 'atop' output showing things when they aren't
> quite as extreme (CPU0 is 56% sys, 14% irq, 11% user). This is with
> all the interrupts shunted away from CPU0, or at least, all the ones
> that would let me.
> 
> PRC | sys   9.73s | user   9.55s | #proc    839 | #zombie    0 | #exit    182 |
> CPU | sys     78% | user     96% | irq      46% | idle    170% | wait     10% |
> cpu | sys     56% | user     11% | irq      14% | idle     19% | cpu000 w  0% |
> cpu | sys      8% | user     32% | irq       7% | idle     51% | cpu001 w  2% |
> cpu | sys      7% | user     27% | irq      12% | idle     49% | cpu003 w  5% |
> cpu | sys      7% | user     26% | irq      13% | idle     51% | cpu002 w  3% |
> CPL | avg1  20.95 | avg5   20.11 | avg15  20.09 | csw   233459 | intr  204338 |
> MEM | tot    3.5G | free  214.6M | cache   1.3G | buff   41.5M | slab   38.2M |
> SWP | tot    2.0G | free    2.0G |              | vmcom   3.2G | vmlim   3.7G |
> DSK |         sda | busy     22% | read       0 | write    138 | avio   15 ms |
> DSK |         sdb | busy     21% | read       0 | write    138 | avio   15 ms |
> NET | transport   | tcpi   45226 | tcpo   34604 | udpi     608 | udpo     273 |
> NET | network     | ipi    45834 | ipo  2667927 | ipfrw      0 | deliv  45834 |
> NET | eth2    96% | pcki       2 | pcko  883468 | si    0 Kbps | so  962 Mbps |
> NET | eth3    95% | pcki       1 | pcko  876330 | si    0 Kbps | so  954 Mbps |
> NET | eth1    95% | pcki       1 | pcko  872871 | si    0 Kbps | so  951 Mbps |
> NET | lo     ---- | pcki   22099 | pcko   22099 | si 1532 Kbps | so 1532 Kbps |
> NET | eth0     0% | pcki   23426 | pcko   12763 | si 2948 Kbps | so 1557 Kbps |
> 
> # uname -r
> 2.6.18-8.el5.tvh.3
> (preemption is enabled)
> 
> # rpm -q centos-release
> centos-release-5-0.0.el5.centos.2
> 
> # egrep 'proc|model name' /proc/cpuinfo
> processor       : 0
> model name      : Intel(R) Xeon(R) CPU           E5472  @ 3.00GHz
> processor       : 1
> model name      : Intel(R) Xeon(R) CPU           E5472  @ 3.00GHz
> processor       : 2
> model name      : Intel(R) Xeon(R) CPU           E5472  @ 3.00GHz
> processor       : 3
> model name      : Intel(R) Xeon(R) CPU           E5472  @ 3.00GHz
> 
> # lspci | fgrep Eth
> 03:00.0 Ethernet controller: Broadcom Corporation Unknown device 165a
> 04:00.0 Ethernet controller: Broadcom Corporation Unknown device 165a
> 09:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit
> Ethernet Controller (rev 06)
> 09:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit
> Ethernet Controller (rev 06)
> 
> # ethtool -i eth1
> driver: tg3
> version: 3.65-rh
> firmware-version: 5722-v3.07
> bus-info: 0000:04:00.0
> # ethtool -i eth2
> driver: e1000
> version: 7.2.7-k2-NAPI
> firmware-version: 5.11-2
> bus-info: 0000:09:00.0
> # ethtool -i eth3
> driver: e1000
> version: 7.2.7-k2-NAPI
> firmware-version: 5.11-2
> bus-info: 0000:09:00.1
> --
> To unsubscribe from this list: send the line "unsubscribe linux-net" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

My 1 cent here ...

I would read this well-written documentation,

	http://irqbalance.org/documentation.php

it sheds light on many different questions.

In addition, stop your irqbalance daemon, and execute the
following in the foreground:

	irqbalance --debug

I would also play around with irqbalance's environment variables,

       IRQBALANCE_BANNED_CPUS
       IRQBALANCE_ONESHOT
       IRQBALANCE_BANNED_INTERRUPTS

to achieve your goals.
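
Both IRQBALANCE_BANNED_CPUS and /proc/irq/<N>/smp_affinity take a
hexadecimal CPU bitmask in which bit 0 stands for CPU0. A minimal
sketch of a helper for building such a mask follows; the cpu_mask name
is made up for illustration and is not part of irqbalance:

```shell
#!/bin/sh
# cpu_mask (hypothetical helper): print the hex bitmask that has one
# bit set for each CPU number given as an argument.
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

# Possible uses (left commented out, they need root; <N> is an IRQ
# number taken from /proc/interrupts):
#   IRQBALANCE_BANNED_CPUS=$(cpu_mask 0) irqbalance --debug
#   cpu_mask 1 2 3 > /proc/irq/<N>/smp_affinity
```

Banning CPU0 needs the mask 1, and steering an interrupt to CPUs 1-3
needs the mask e.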




NAPI is also called "RX polling", because it uses a mixture of
polling and interrupts to process incoming network frames.
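
The polling half of NAPI runs in softirq context, and that work is not
counted in /proc/interrupts. As far as I know, atop's "irq" figure
lumps hard and soft interrupt time together, so it may be worth
splitting the two per CPU. A rough sketch that reads /proc/stat, where
fields 7 and 8 of each cpuN line are the cumulative irq and softirq
times in clock ticks (the irq_breakdown name is made up for
illustration):

```shell
#!/bin/sh
# irq_breakdown (hypothetical helper): print cumulative hard-irq and
# softirq ticks for each CPU. Reads /proc/stat unless another file is
# given as the first argument.
irq_breakdown() {
    awk '/^cpu[0-9]+ / {
        # $1 = cpuN, $7 = irq ticks, $8 = softirq ticks
        printf "%s irq=%s softirq=%s\n", $1, $7, $8
    }' "${1:-/proc/stat}"
}
```

Run it twice a few seconds apart and compare the numbers; if the
softirq count on cpu0 is the one that grows, the time is going to
deferred work rather than to the hardware interrupts themselves.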

Also, for the e1000 driver, I would look at your kernel's documentation:

	/usr/src/linux/Documentation/networking/e1000.txt

You will notice the e1000 driver has NAPI (RX polling) enabled by
default; the "-NAPI" suffix in the version string reported by your
ethtool commands above confirms it.

As for the tg3 driver, if you go through its source code,
/usr/src/linux/drivers/net/tg3.c, you will notice the RX
polling code for NAPI.
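
For a quick check across interfaces, something like the sketch below
looks for the NAPI tag in the version line of "ethtool -i" output. The
napi_tag name is made up for illustration, and a missing tag proves
nothing: tg3 reports a plain version such as 3.65-rh, which is why its
source is the place to look.

```shell
#!/bin/sh
# napi_tag (hypothetical helper): read "ethtool -i" output on stdin and
# report whether the driver's version line carries a NAPI tag.
napi_tag() {
    if grep -q '^version:.*NAPI'; then
        echo "NAPI tag present"
    else
        echo "no NAPI tag"
    fi
}

# Possible use (left commented out, it needs ethtool and the interface):
#   ethtool -i eth2 | napi_tag
```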



I hope this helps some .. :)