I have a 16-core x86_64 machine (4 chips x 4 cores/chip) that has 4
Intel 82571EB Gb NICs (2 PCI cards x 2 NICs/card) using the e1000 driver.
I have a simple client/server micro-benchmark that pounds a server on
each NIC with requests to measure peak throughput. I am running Ubuntu
8.04.1, kernel version 2.6.24.
What I am observing is that a single ksoftirqd thread is becoming a
bottleneck for the system.
More specifically, one cpu runs ksoftirqd at 100% cpu utilization, while
4 cpus each run their servers at about 25%. I carefully used
sched_setaffinity() to map server threads to cpus and
/proc/irq/<device>/smp_affinity to map hardware interrupts to cpus such
that there should be exactly 1 cpu per server thread and 1 cpu for
servicing hardware interrupts per device.
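For readers who want to reproduce this kind of pinning, here is a minimal
sketch of the two steps. It is not the benchmark's actual code: it uses
Python's os.sched_setaffinity(), which wraps the same system call, and the
helper names (cpu_mask_hex, pin_current_process, set_irq_affinity) and the
IRQ numbers are hypothetical. It assumes fewer than 32 CPUs, so the mask
fits in the single hex word that smp_affinity accepts, and writing the
mask requires root.

```python
import os


def cpu_mask_hex(cpus):
    """Build the hex CPU bitmask written to /proc/irq/<irq>/smp_affinity.

    Bit N set means CPU N may service the interrupt. Assumes CPU numbers
    below 32 so the mask is one 32-bit hex word.
    """
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < 32:
            raise ValueError("this sketch only handles CPUs 0..31")
        mask |= 1 << cpu
    return format(mask, "x")


def pin_current_process(cpu):
    """Restrict the caller to one CPU (pid 0 means the calling thread)."""
    os.sched_setaffinity(0, {cpu})


def set_irq_affinity(irq, cpus, proc_root="/proc"):
    """Write the mask for `cpus` to <proc_root>/irq/<irq>/smp_affinity."""
    path = os.path.join(proc_root, "irq", str(irq), "smp_affinity")
    with open(path, "w") as f:
        f.write(cpu_mask_hex(cpus))
```

For example, a server thread for one NIC could call pin_current_process(1)
while the matching interrupt is steered with set_irq_affinity(48, [0]),
where 48 stands in for whatever IRQ number /proc/interrupts reports for
that device.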
I can observe (via /proc/interrupts) that the interrupts are being
distributed properly, but despite this I only see 1 or 2 ksoftirqd
threads running, with the server daemons bottlenecked behind them. (This
is with NAPI disabled. With NAPI enabled, I can't get even 2 ksoftirqd
threads to run.) I have tried various permutations, such as assigning
each hardware interrupt to a different physical chip.
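The check against /proc/interrupts can also be scripted. The sketch below
is only an illustration: the function names are hypothetical, and it
assumes the usual layout of a header row of CPU names followed by one row
per interrupt with a count for each CPU.

```python
def parse_interrupts(text):
    """Map each IRQ label to (per_cpu_counts, description).

    Rows such as ERR: carry fewer counts than there are CPUs, so only the
    leading numeric fields are taken as counts.
    """
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (counts, description)
    return table


def busiest_cpu(counts):
    """Return the index of the CPU that handled the most interrupts."""
    return max(range(len(counts)), key=counts.__getitem__)


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        for irq, (counts, description) in parse_interrupts(f.read()).items():
            if "eth" in description and counts:
                print(irq, description, "-> CPU", busiest_cpu(counts))
```

Run before and after a benchmark pass, the difference in the counts shows
which CPU actually took each NIC's hardware interrupts. It says nothing
about where the softirq work runs, which is the part in question here.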
Desired Result:
It seems to me that with 4 independent NICs and plenty of CPUs to spare,
I ought to be able to assign one softirq daemon to each NIC rather than
funnelling all of the traffic through 1 or 2.
Any advice on this issue is greatly appreciated.
Best regards,
Don Porter