On Wed, Sep 16, 2020 at 05:59:24PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2020-09-15 at 20:40 -0300, Jason Gunthorpe wrote:
> > Not quite, upstream kernel will never use WC on those
> > devices. DEVICE_GRE is not supported in upstream,
> > arch_can_pci_mmap_wc() is always false and the WC tester will always
> > fail.
> >
> > > With the patch, those device will now use MT_DEVICE_NC.
> >
> > Which doesn't do WC at all on some ARM implementations.
>
> Lovely... this is arm64 btw, still the case ?

Yep

> Also we could make this a variable rather than a constant and choose
> a more appropriate set of flags at boot time....

It is a function, so it could check the CPU ID for the known broken
devices and block them.

> > > Why would that be a regression ?
> >
> > Using the WC submission flow when it doesn't work costs something like
> > 10% performance vs using the non-WC flow.
>
> You mean the driver uses a different path to the HW which has that
> overhead, not that MMIOs have that overhead right ?

The different path has overhead of doing extra useless MMIOs because
they don't combine

Jason
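
For illustration only, the "check the CPU ID and block the known broken
devices" idea could be sketched roughly as below. This is a user-space
model, not kernel code: cpu_can_mmap_wc(), CPU_MODEL_MASK and the entries
in wc_broken_models[] are all hypothetical names and placeholder values
invented for this sketch, and the ID layout is only assumed to follow the
arm64 MIDR field positions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed MIDR-style layout: keep implementer [31:24], architecture
 * [19:16] and part number [15:4]; ignore variant and revision.
 */
#define CPU_MODEL_MASK 0xff0ffff0u

/* HYPOTHETICAL placeholder IDs, not real CPU models. */
static const uint32_t wc_broken_models[] = {
	0xaa0f0010u,
	0xbb0f0020u,
};

/*
 * Return false when the CPU model is on the deny list, so the caller
 * falls back to the non-WC path instead of attempting WC mappings.
 */
static bool cpu_can_mmap_wc(uint32_t midr)
{
	uint32_t model = midr & CPU_MODEL_MASK;
	size_t i;

	for (i = 0; i < sizeof(wc_broken_models) / sizeof(wc_broken_models[0]); i++) {
		if (wc_broken_models[i] == model)
			return false;
	}
	return true;
}
```

With something of this shape, a driver would see WC reported as
unavailable on the listed parts and pick the non-WC submission flow,
rather than paying for extra MMIOs that never combine.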