Hi,

I work on home routers based on Broadcom's Northstar SoCs. Those devices have ARM Cortex-A9 CPUs and most of them are dual-core.

As these are home routers, my main concern is network performance. That CPU isn't powerful enough to handle gigabit traffic, so all kinds of optimizations matter. I noticed some unexpected changes in NAT performance when switching between kernels.

My hardware is a BCM47094 SoC (dual-core ARM) with an integrated network controller and an external BCM53012 switch.

Relevant setup:
* SoC network controller is wired to the hardware switch
* Switch passes 802.1q frames with VID 1 to four LAN ports
* Switch passes 802.1q frames with VID 2 to WAN port
* Linux does NAT for LAN (eth0.1) to WAN (eth0.2)
* Linux uses pfifo and "echo 2 > rps_cpus"
* Ryzen 5 PRO 2500U (x86_64) laptop connected to a LAN port
* Intel i7-2670QM laptop connected to a WAN port

*****

I found a very nice example of a commit that does /nothing/ yet affects NAT performance:
9316a9ed6895 ("blk-mq: provide helper for setting up an SQ queue and tag set")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9316a9ed6895c4ad2f0cde171d486f80c55d8283

All it does is export an unused symbol (function).

Let me share some numbers (I use iperf for testing):

git reset --hard v4.19
git am OpenWrt-mtd-chages.patch
[ 3] 0.0-30.0 sec 2.60 GBytes 745 Mbits/sec
[ 3] 0.0-30.0 sec 2.60 GBytes 745 Mbits/sec
[ 3] 0.0-30.0 sec 2.60 GBytes 744 Mbits/sec
[ 3] 0.0-30.0 sec 2.59 GBytes 742 Mbits/sec
[ 3] 0.0-30.0 sec 2.59 GBytes 740 Mbits/sec
[ 3] 0.0-30.0 sec 2.59 GBytes 740 Mbits/sec
[ 3] 0.0-30.0 sec 2.58 GBytes 738 Mbits/sec
[ 3] 0.0-30.0 sec 2.58 GBytes 738 Mbits/sec
[ 3] 0.0-30.0 sec 2.58 GBytes 738 Mbits/sec
[ 3] 0.0-30.0 sec 2.57 GBytes 735 Mbits/sec
Average: 741 Mb/s

git reset --hard v4.19
git am OpenWrt-mtd-chages.patch
git cherry-pick -x 9316a9ed6895
[ 3] 0.0-30.0 sec 2.73 GBytes 780 Mbits/sec
[ 3] 0.0-30.0 sec 2.72 GBytes 777 Mbits/sec
[ 3] 0.0-30.0 sec 2.71 GBytes 775 Mbits/sec
[ 3] 0.0-30.0 sec 2.70 GBytes 773 Mbits/sec
[ 3] 0.0-30.0 sec 2.70 GBytes 771 Mbits/sec
[ 3] 0.0-30.0 sec 2.69 GBytes 771 Mbits/sec
[ 3] 0.0-30.0 sec 2.69 GBytes 771 Mbits/sec
[ 3] 0.0-30.0 sec 2.69 GBytes 770 Mbits/sec
[ 3] 0.0-30.0 sec 2.69 GBytes 769 Mbits/sec
[ 3] 0.0-30.0 sec 2.68 GBytes 768 Mbits/sec
Average: 773 Mb/s

As you can see, cherry-picking (on top of Linux 4.19) a single commit that does /nothing/ can improve NAT performance by about 4.3% (741 Mb/s → 773 Mb/s).

*****

I was hoping to learn something from profiling the kernel with the "perf" tool. Enabling CONFIG_PERF_EVENTS resulted in a smaller NAT performance gain: 741 Mb/s → 750 Mb/s. I tried it anyway.

Without cherry-picking I got:
+    9,04%  swapper      [kernel.kallsyms]  [k] v7_dma_inv_range
+    5,54%  swapper      [kernel.kallsyms]  [k] __irqentry_text_end
+    5,12%  swapper      [kernel.kallsyms]  [k] l2c210_inv_range
+    4,30%  ksoftirqd/1  [kernel.kallsyms]  [k] v7_dma_clean_range
+    4,02%  swapper      [kernel.kallsyms]  [k] bcma_host_soc_read32
+    3,13%  swapper      [kernel.kallsyms]  [k] arch_cpu_idle
+    2,88%  ksoftirqd/1  [kernel.kallsyms]  [k] __netif_receive_skb_core
+    2,51%  ksoftirqd/1  [kernel.kallsyms]  [k] l2c210_clean_range
+    1,88%  ksoftirqd/1  [kernel.kallsyms]  [k] fib_table_lookup
(741 Mb/s while *not* running perf)

With cherry-picked 9316a9ed6895 I got:
+    9,16%  swapper      [kernel.kallsyms]  [k] v7_dma_inv_range
+    5,64%  swapper      [kernel.kallsyms]  [k] __irqentry_text_end
+    5,05%  swapper      [kernel.kallsyms]  [k] l2c210_inv_range
+    4,25%  ksoftirqd/1  [kernel.kallsyms]  [k] v7_dma_clean_range
+    4,10%  swapper      [kernel.kallsyms]  [k] bcma_host_soc_read32
+    3,35%  ksoftirqd/1  [kernel.kallsyms]  [k] __netif_receive_skb_core
+    3,17%  swapper      [kernel.kallsyms]  [k] arch_cpu_idle
+    2,49%  ksoftirqd/1  [kernel.kallsyms]  [k] l2c210_clean_range
+    2,03%  ksoftirqd/1  [kernel.kallsyms]  [k] fib_table_lookup
(750 Mb/s while *not* running perf)

The differences seem quite minimal and I'm not sure they say anything at all about what is causing that NAT performance change.

*****

I also tried running cachestat but didn't get anything interesting:

Counting cache functions... Output every 1 seconds.
TIME         HITS   MISSES  DIRTIES    RATIO   BUFFERS_MB   CACHE_MB
10:06:59     1020        5        0    99.5%            0          2
10:07:00     1029        0        0   100.0%            0          2
10:07:01     1013        0        0   100.0%            0          2
10:07:02     1029        0        0   100.0%            0          2
10:07:03     1029        0        0   100.0%            0          2
10:07:04      997        0        0   100.0%            0          2
10:07:05     1013        0        0   100.0%            0          2
(I started iperf at 10:07:00).

*****

There were more situations with such unexpected performance changes.

Another example: cherry-picking
5b0890a97204 ("flow_dissector: Parse batman-adv unicast headers")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5b0890a97204627d75a333fc30f29f737e2bfad6
onto some Linux 4.14.x release lowered NAT performance by 55 Mb/s.

The tricky part is that there aren't any ETH_P_BATMAN packets in my traffic. Extra tests revealed that any __skb_flow_dissect() modification lowered my NAT performance (e.g. commenting out the ETH_P_TIPC or ETH_P_FCOE switch cases).

*****

I would like every kernel to provide maximum NAT performance, no matter what random commits it contains. Suffering from such random changes also makes it really hard to notice a real performance regression.

Do you have any idea what is causing those performance changes? Can I provide any extra info to help debugging this?

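In case it helps to double-check the averages: below is a minimal Python sketch (not part of the test setup itself) that recomputes the mean, the run-to-run spread and the relative gain from the iperf results quoted earlier in this mail. The names BASELINE, CHERRY_PICKED, summarize and relative_gain are made up for illustration only.

```python
from statistics import mean, stdev

# iperf results in Mbits/sec, copied from the two test series in this mail.
BASELINE = [745, 745, 744, 742, 740, 740, 738, 738, 738, 735]
CHERRY_PICKED = [780, 777, 775, 773, 771, 771, 771, 770, 769, 768]


def summarize(samples):
    """Return (mean, sample standard deviation) of a list of throughputs."""
    return mean(samples), stdev(samples)


def relative_gain(before, after):
    """Relative change of the mean throughput, in percent."""
    return (mean(after) - mean(before)) / mean(before) * 100.0


if __name__ == "__main__":
    series = (
        ("v4.19", BASELINE),
        ("v4.19 + 9316a9ed6895", CHERRY_PICKED),
    )
    for name, samples in series:
        avg, sd = summarize(samples)
        print(f"{name}: mean {avg:.1f} Mb/s, stdev {sd:.1f} Mb/s")
    print(f"gain: {relative_gain(BASELINE, CHERRY_PICKED):.1f}%")
```

Printing the standard deviation next to the mean is only meant to show whether the difference between the two kernels is larger than the variation between runs of the same kernel. Ten runs per kernel is a small sample, so treat the result as a rough sanity check.
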
047-v4.21-mtd-keep-original-flags-for-every-struct-mtd_info.patch
048-v4.21-mtd-improve-calculating-partition-boundaries-when-ch.patch
080-v5.1-0001-bcma-keep-a-direct-pointer-to-the-struct-device.patch
080-v5.1-0002-bcma-use-dev_-printing-functions.patch
095-Allow-class-e-address-assignment-via-ifconfig-ioctl.patch
140-jffs2-use-.rename2-and-add-RENAME_WHITEOUT-support.patch
141-jffs2-add-RENAME_EXCHANGE-support.patch
400-mtd-add-rootfs-split-support.patch
401-mtd-add-support-for-different-partition-parser-types.patch
402-mtd-use-typed-mtd-parsers-for-rootfs-and-firmware-split.patch
403-mtd-hook-mtdsplit-to-Kbuild.patch
404-mtd-add-more-helper-functions.patch
431-mtd-bcm47xxpart-check-for-bad-blocks-when-calculatin.patch
432-mtd-bcm47xxpart-detect-T_Meter-partition.patch
480-mtd-set-rootfs-to-be-root-dev.patch
490-ubi-auto-attach-mtd-device-named-ubi-or-data-on-boot.patch
491-ubi-auto-create-ubiblock-device-for-rootfs.patch
492-try-auto-mounting-ubi0-rootfs-in-init-do_mounts.c.patch
493-ubi-set-ROOT_DEV-to-ubiblock-rootfs-if-unset.patch
530-jffs2_make_lzma_available.patch
532-jffs2_eofdetect.patch
500-v4.20-ubifs-Fix-default-compression-selection-in-ubifs.patch
553-ubifs-Add-option-to-create-UBI-FS-version-4-on-empty.patch
700-swconfig_switch_drivers.patch
702-phy_add_aneg_done_function.patch
721-phy_packets.patch
773-bgmac-add-srab-switch.patch
910-kobject_uevent.patch
911-kobject_add_broadcast_uevent.patch