Re: [RFC] add __iomem cookie for EPF BAR

On Tue, 08 Aug 2023 15:44:44 +0800,
Arnd Bergmann wrote:
Hi Arnd,
> 
> On Tue, Aug 8, 2023, at 09:03, mani wrote:
> > On Mon, Aug 07, 2023 at 08:28:30PM +0800, Li Chen wrote:
> >> 
> >> Currently, the EPF's BAR is allocated by pci_epf_alloc_space(), which internally uses dma_alloc_coherent(). The caching behavior of dma_alloc_coherent() may vary between platforms.
> >> 
> >> The BAR space is exported to the RC, which means the RC may modify it without the EP being aware of it, so the EP may still read from its cache and get stale data. To address this issue, the BAR space should be treated as iomem instead and accessed only through the I/O read/write APIs, which enforce volatile semantics.
> >> 
> >
> > We already had a similar discussion on using volatile for BAR space and settled
> > with {WRITE/READ}_ONCE macros in EPF Test driver [1].
> >
> > Since the BAR space allocated in the endpoint is not MMIO, I don't think it
> > should be forced to iomem. Rather, EPF drivers should use the _ONCE macros to
> > access the fields to avoid coherency issues, as suggested by Arnd earlier.
> 
> Using readl/writel is clearly the wrong solution here as I explained
> before, but I assume that Li Chen is dealing with a real problem.

Thanks, I learnt much from your mail.
Actually, I'm not dealing with a real problem.

> If the cache is coherent with the device, then reading from the cache
> is clearly the right thing to do,

I guess that even SoCs with CCI support might not keep the CPU caches
coherent with RC accesses if the relevant bus interfaces are not connected.

> but the mentioned "stall" problem may
> be related to the store buffers, where a dma_wmb() after the
> WRITE_ONCE() is missing. Similarly, a dma_rmb() might be missing before
> a READ_ONCE() to prevent prefetching during out-of-order execution.
> 
> With readl()/writel(), you already get very heavy barriers, so it may
> end up working by accident, but these barriers are at the other side
> of the access (before writel and after readl) and may be the wrong
> type of barrier depending on the CPU.

For systems that aren't cache-coherent, is it accurate to say that the
store buffer might still be in play, and that dma_wmb() and dma_rmb()
might still be needed?


