On Tue, May 26, 2020 at 07:57:49PM +0200, Alexander Dahl wrote:
> The intermediate result of the old term (4UL * 1024 * 1024 * 1024) is
> 4 294 967 296 or 0x100000000, which is no problem on 64 bit systems. The
> patch does not change the later overall result of 0x100000 for
> MAX_DMA32_PFN. The new calculation yields the same result, but does not
> require 64 bit arithmetic.
>
> On 32 bit systems the old calculation suffers from an arithmetic
> overflow in that intermediate term in parentheses: 4UL aka unsigned long
> int is 4 bytes wide, so an arithmetic overflow happens (0x100000000 does
> not fit in 4 bytes), the parenthesized result is truncated to zero, the
> following right shift does not alter that, so MAX_DMA32_PFN evaluates to
> 0 on 32 bit systems.
>
> That wrong value is a problem in a comparison against MAX_DMA32_PFN in
> the init code for swiotlb in 'pci_swiotlb_detect_4gb()' to decide if
> swiotlb should be active. That comparison yields the opposite result
> when compiling on 32 bit systems.
>
> This was not possible before 1b7e03ef7570 ("x86, NUMA: Enable emulation
> on 32bit too"), when MAX_DMA32_PFN was first made visible to x86_32
> (and which landed in v3.0).
>
> In practice this wasn't a problem, unless you activated CONFIG_SWIOTLB
> on x86 (32 bit).
>
> However, for ARCH=x86 (32 bit) with CONFIG_IOMMU_INTEL set, since
> c5a5dc4cbbf4 ("iommu/vt-d: Don't switch off swiotlb if bounce page
> is used") there is a dependency on CONFIG_SWIOTLB, which was not
> necessarily active before. That landed in v5.4, where we noticed it in
> the fli4l Linux distribution. We have CONFIG_IOMMU_INTEL active on both
> 32 and 64 bit kernel configs there (I could not find out why, so let's
> just say historical reasons).
>
> The effect is that at boot time 64 MiB (the default size) are now
> allocated for bounce buffers, which is a noticeable amount of memory on
> small systems like the PC Engines ALIX 2D3 with 256 MiB of memory, which
> are still frequently used as home routers.
>
> We noticed this effect when migrating from kernel v4.19 (LTS) to v5.4
> (LTS) in fli4l and got kernel messages like these:
>
>   Linux version 5.4.22 (buildroot@buildroot) (gcc version 7.3.0 (Buildroot 2018.02.8)) #1 SMP Mon Nov 26 23:40:00 CET 2018
>   …
>   Memory: 183484K/261756K available (4594K kernel code, 393K rwdata, 1660K rodata, 536K init, 456K bss, 78272K reserved, 0K cma-reserved, 0K highmem)
>   …
>   PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
>   software IO TLB: mapped [mem 0x0bb78000-0x0fb78000] (64MB)
>
> The initial analysis and the suggested fix were done by user 'sourcejedi'
> at Stack Exchange and explicitly marked as GPLv2 for inclusion in the
> Linux kernel:
>
>   https://unix.stackexchange.com/a/520525/50007
>
> The new calculation, which does not suffer from that overflow, is now
> the same as for arch/mips, as suggested by Robin Murphy.
>
> The fix was tested by fli4l users on roughly two dozen different
> systems, including both 32 and 64 bit archs, bare metal and virtualized
> machines.
>
> Fixes: 1b7e03ef7570 ("x86, NUMA: Enable emulation on 32bit too")
> Fixes: https://web.nettworks.org/bugs/browse/FFL-2560
> Fixes: https://unix.stackexchange.com/q/520065/50007
> Reported-by: Alan Jenkins <alan.christopher.jenkins@xxxxxxxxx>
> Suggested-by: Robin Murphy <robin.murphy@xxxxxxx>
> Signed-off-by: Alexander Dahl <post@xxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx

Reviewed-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>