On Fri, Sep 11, 2020 at 07:46:47AM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2020-09-10 at 14:10 -0300, Jason Gunthorpe wrote:
> > Can you explain what this actually does on ARM?
> >
> > Can it ever speculate loads across page boundaries, or speculate
> > loads that never exist in the program? ie will we get random
> > unpredictable MemRds?
>
> Probably, at least on powerpc you will as well, that's the only way to
> get write combine.

If I remove the PROT_READ in the user space mmap will it block it?

Read TLPs are not harmful but I suspect they would cause an
undesirable random performance anomaly.

> > Does it/could it "combine writes"?
>
> I assume so for ARM, definitely for powerpc.

Various IBM PPC chips I know work, we do test that.

> > > That's why I looped you in - that's what worries me about
> > > "enabling" arch_can_pci_mmap_wc() on arm64. If we enable it and
> > > we have perf regressions that's not OK.
> > >
> > > Or we *can* enable arch_can_pci_mmap_wc() but force the mellanox
> > > driver (or more broadly all drivers following this message push
> > > semantics) to use "something else" for WC detection.
> >
> > arch_can_pci_mmap_wc() really only controls the sysfs resource file
> > and it seems very unclear who in userspace uses that these days.
>
> dpdk under some circumstances afaik.

And something gross for DMA then? Not sure dpdk is useful without DMA.

Why not use CONFIG_VFIO_NOIOMMU for such a non-secure thing?

> > vfio is now the right way to do that stuff. I don't see an obvious
> > way to get WC memory in VFIO though...
>
> Which would be a performance issue on a number of things I suppose...

Almost nothing uses pci_iomap_wc(), so I'd be surprised if userspace
DPDK was an important user when an in-kernel driver for the same HW
doesn't use it?

Jason