On Tue, May 21, 2019 at 01:16:12PM +0200, Rafał Miłecki wrote:
> On Tue, 21 May 2019 at 12:45, Russell King - ARM Linux admin
> <linux@xxxxxxxxxxxxxxx> wrote:
> > On Tue, May 21, 2019 at 12:28:48PM +0200, Rafał Miłecki wrote:
> > > I work on home routers based on Broadcom's Northstar SoCs. Those devices
> > > have ARM Cortex-A9 and most of them are dual-core.
> > >
> > > As for home routers, my main concern is network performance. That CPU
> > > isn't powerful enough to handle gigabit traffic so all kind of
> > > optimizations do matter. I noticed some unexpected changes in NAT
> > > performance when switching between kernels.
> > >
> > > My hardware is BCM47094 SoC (dual core ARM) with integrated network
> > > controller and external BCM53012 switch.
> >
> > Guessing, I'd say it's to do with the placement of code wrt cachelines.
>
> That was my guess as well, that's why I tried "cachestat" tool.
>
> > You could try aligning some of the cache flushing code to a cache line
> > and see what effect that has.
>
> Can you give me some extra hint on how to do that, please? I tried
> searching for it a bit but I didn't find any clear article on that
> matter.

IIRC, the cache line size on Cortex A9 is 32 bytes, so the assembler
directive would be ".align 5". Place that in arch/arm/mm/cache-v7.S
before v7_dma_clean_range and v7_dma_inv_range.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up