On Wed, Feb 15, 2023, at 14:24, Greg Kroah-Hartman wrote:
> On Wed, Feb 15, 2023 at 09:18:48PM +0900, Damien Le Moal wrote:
>> On 2/15/23 21:01, Greg Kroah-Hartman wrote:
>> >>>> @@ -330,6 +330,10 @@ static int pci_epf_test_copy(struct pci_epf_test *epf_test, bool use_dma)
>> >>>> 	enum pci_barno test_reg_bar = epf_test->test_reg_bar;
>> >>>> 	volatile struct pci_epf_test_reg *reg = epf_test->reg[test_reg_bar];
>> >>>
>> >>> note, volatile is almost always wrong, please fix that up.
>> >>
>> >> OK. Will think of something else.
>> >
>> > If this is io memory, use the proper accessors to access it. If it is
>> > not io memory, then why is it marked volatile at all?
>>
>> This is a PCI bar memory. So I can simply copy the structure locally with
>> memcpy_fromio() and memcpy_toio().
>
> Great, please do so instead of trying to access it directly like this,
> which will break on some platforms.

I think the reverse is true here: looking at where the pointer comes
from, 'reg' is actually the result of dma_alloc_coherent() in the memory
of the local (endpoint) machine, though it appears as a BAR on the
remote (host) side and gets mapped with ioremap() there.

This means that the host side must use readl/writel/memcpy_fromio/memcpy_toio
to access the buffer, matching the __iomem token there, while the
endpoint side must not use those. On some machines, readl/writel take
arguments that are not compatible with normal pointers, and will do
something completely different there.

A volatile access is not the worst option here, though it conflicts
with the '__packed' annotation in the structure definition, which may
require bytewise access on architectures without unaligned access.

I would drop the __packed from the definition, possibly annotating only
the 64-bit src_addr and dst_addr members as __packed to ensure the
layout is unchanged while the structure as a whole becomes 32-bit
aligned, and then use READ_ONCE()/WRITE_ONCE() to atomically access
each member in the coherent buffer. If ordering between the accesses
is required, you can add dma_rmb() and dma_wmb() barriers.
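Something along the lines of this completely untested sketch, based on
my reading of the current layout in pci-epf-test.c (adjust to match the
actual definition):

struct pci_epf_test_reg {
	u32	magic;
	u32	command;
	u32	status;
	/*
	 * src_addr sits at offset 12 and is not naturally aligned, so
	 * keep these two members packed to preserve the existing layout
	 * while the structure as a whole stays 32-bit aligned.
	 */
	u64	src_addr __packed;
	u64	dst_addr __packed;
	u32	size;
	u32	checksum;
	u32	irq_type;
	u32	irq_number;
	u32	flags;
};

and then on the endpoint side something like:

	u64 src = READ_ONCE(reg->src_addr);
	u64 dst = READ_ONCE(reg->dst_addr);
	u32 size = READ_ONCE(reg->size);

	/* order the descriptor reads against reading the copied data */
	dma_rmb();

	/* ... perform the transfer ... */

	/* make the copied data visible before signaling completion */
	dma_wmb();
	WRITE_ONCE(reg->status, STATUS_COPY_SUCCESS);

The barrier pair is only needed if the host is meant to observe the
status update strictly after the data, as mentioned above.

     Arnd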