Hello all,

I'm almost certain that this is not the right place to ask this question, but if RedHat/Fedora's kernel engineers can't help me, I'm truly screwed.

I'm using two Intel 10GbE (ixgbe) cards to passively monitor 10GbE lines (under RHEL 5.2), either using the in-kernel dev_add_pack interface (built-in ixgbe driver) or using a slightly modified ixgbe driver (built around Intel's latest ixgbe driver).

However, I'm experiencing odd performance issues. Namely, once I configure the driver to use MSI-X w/ multi-queue [MQ] (forcing pci=msi) and assign each IRQ to one CPU core (irq cpu affinity), my software requires 10x more CPU cycles to process each packet (measured using rdtsc; compared to multiple GbE links and/or w/ MSI-X/MQ disabled), causing massive packet loss induced by missed IRQs (rx_missed_errors).
Looking at mpstat, I can see that each CPU core is handling a fairly low number of interrupts (200-1000) while spending most of its time in softIRQ (>90%, most likely within my own code).

I decided to check newer kernels, so I've installed F10 (24C Xeon-MP Intel S7000FC4U) and F9 (16C Opteron DL585G5, *) on two machines, but even with 2.6.27 kernels I'm experiencing the same performance issues.

Given the fact that the same code is used to process packets, no matter what type of links are being used, my first instinct was to look at the CPU cores themselves (e.g. L1 & L2 dcache miss rates, TLB flushes, etc.).
I tried using oprofile, but I failed to make it work.
On one machine (Xeon-MP, F10), oprofile failed to identify the Dunnington CPU (switching to timer mode), and on the other (Barcelona 8354, F9), even though it was configured to report dcache statistics [1,2], opreport returns empty reports.
In order to verify that oprofile indeed works on the Opteron machine, I reconfigured oprofile to report CPU usage [3], but even then, oprofile either returns empty results or hard-locks the machine.

So:
A. Is anyone else seeing the same odd behavior once MSI-X/MQ is enabled on Intel's 10G cards? (P.S. MQ cannot be enabled on either machine unless I add pci=msi to the kernel's command line.)
B. Any idea why oprofile refuses to generate cache statistics and/or what I did wrong?
C. Before I dive into AMD's and Intel's MSR/PMC documentation and spend the next five days trying to decipher which architectural / non-architectural counter needs to be set/used and how, do you have any idea how I can access the performance counters without writing the code myself?

- Gilboa

[1] opcontrol --setup --vmlinux /usr/lib/debug/lib/modules/2.6.27.9-73.fc9.x86_64/vmlinux --event=DATA_CACHE_ACCESS:1000:0:1:1
[2] opcontrol --setup --vmlinux /usr/lib/debug/lib/modules/2.6.27.9-73.fc9.x86_64/vmlinux --event=L2_CACHE_MISS:1000:0:1:1
[3] opcontrol --setup --vmlinux /usr/lib/debug/lib/modules/2.6.27.9-73.fc9.x86_64/vmlinux --event=CPU_CLK_UNHALTED:10000000:0:1:1
* F10 seems to dislike the DL585G5; issue already reported against anaconda. (#480638)

_______________________________________________
Fedora-kernel-list mailing list
Fedora-kernel-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-kernel-list
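
For anyone trying to reproduce the setup, here is a minimal sketch of the "one IRQ per CPU core" step and of reading the per-core interrupt counts. It is not the code used in the post. It assumes the standard /proc/interrupts layout and the /proc/irq/<N>/smp_affinity hex-mask interface; the helper names (affinity_mask, irq_counts, pin_irqs) and the "eth2" interface name are hypothetical, and the way the driver names its MSI-X vectors varies between ixgbe versions.

```python
def affinity_mask(cpu):
    """Return an smp_affinity string selecting exactly one CPU.

    The mask is written as comma-separated 32-bit hex words, most
    significant word first, so it also works above CPU 31.
    """
    mask = 1 << cpu
    words = []
    while True:
        words.append(mask & 0xFFFFFFFF)
        mask >>= 32
        if mask == 0:
            break
    words.reverse()
    return ",".join([format(words[0], "x")] +
                    [format(w, "08x") for w in words[1:]])


def irq_counts(text, name_filter):
    """Map IRQ number -> per-CPU counts for lines containing name_filter.

    `text` is the content of /proc/interrupts; only numbered IRQ lines
    are considered (NMI, LOC, ERR and similar rows are skipped).
    """
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        head, _, rest = line.partition(":")
        if name_filter not in line or not head.strip().isdigit():
            continue
        result[int(head)] = [int(x) for x in rest.split()[:ncpu]]
    return result


def pin_irqs(irqs, first_cpu=0, proc_root="/proc"):
    """Pin each IRQ to its own core (needs root; proc_root is for testing)."""
    for offset, irq in enumerate(sorted(irqs)):
        path = "%s/irq/%d/smp_affinity" % (proc_root, irq)
        with open(path, "w") as f:
            f.write(affinity_mask(first_cpu + offset))
```

Typical use would be to read /proc/interrupts, call irq_counts(text, "eth2") to find the queue vectors, and pass the resulting IRQ numbers to pin_irqs(). Note that irqbalance, if running, may rewrite the masks afterwards.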