Re: [PATCH v3 9/9] PCI: endpoint: Set prefetch when allocating memory for 64-bit BARs

On Mon, Mar 18, 2024, at 16:13, Niklas Cassel wrote:
> On Mon, Mar 18, 2024 at 08:25:36AM +0100, Arnd Bergmann wrote:
>
> I personally just care about pci-epf-test, but obviously I don't
> want to regress any other user of pci_epf_alloc_space().
>
> Looking at the endpoint side driver:
> drivers/pci/endpoint/functions/pci-epf-test.c
> and the host side driver:
> drivers/misc/pci_endpoint_test.c
>
> On the RC side, allocating buffers that the EP will DMA to is
> done using: kzalloc() + dma_map_single().
>
> On EP side:
> drivers/pci/endpoint/functions/pci-epf-test.c
> uses dma_map_single() when using DMA, and signals completion using MSI.
>
> On EP side:
> When reading/writing to the BARs, it simply does:
> READ_ONCE()/WRITE_ONCE():
> https://github.com/torvalds/linux/blob/v6.8/drivers/pci/endpoint/functions/pci-epf-test.c#L643-L648
>
> There is no dma_sync(), so the pci-epf-test driver currently seems to
> depend on the backing memory being allocated by dma_alloc_coherent().

From my reading of that function, this is really some kind
of command buffer that implements individual structured
registers and can be accessed from both sides at the same
time, so it would not actually make sense with the streaming
interface and wc/prefetchable access in place of explicit
READ_ONCE/WRITE_ONCE and readl/writel accesses.

>> If you don't care about ordering on that level, I would use
>> dma_map_sg() on the endpoint side and prefetchable mapping on
>> the host side, with the endpoint using dma_sync_*() to pass
>> buffer ownership between the two sides, as controlled by some
>> other communication method (non-prefetchable BAR, MSI, ...).
>
> I don't think that there is any big reason why pci-epf-test is
> implemented using dma_alloc_coherent() rather than dma_sync()
> for the memory backing the BARs, but that is the way it is.
>
> Since I don't feel like totally rewriting pci-epf-test, and since
> you say that we shouldn't use dma_alloc_coherent() for the memory
> backing the BARs together with exporting the BAR as prefetchable,
> I will drop this patch from the series in the next revision.

Ok. It might still be useful to extend the driver to also
allow transferring streaming data through a BAR on the
endpoint side. From what I can tell, it currently supports
using either slave DMA or an RC side buffer that is ioremapped
into the endpoint, but that uses a regular ioremap() as well.
Mapping the RC side buffer as WC should make it possible to
transfer data from EP to RC more efficiently, but for the RC
to EP transfers you really want the buffer to be allocated on
the EP, so you can ioremap_wc() it on the RC for a memcpy_toio(),
or do a cacheable read from the EP.

      Arnd
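
The ownership handoff described in the thread (one side fills a
streaming buffer, then passes it to the other side and stops touching
it until it is handed back) can be sketched as a small userspace model.
This is only an illustration under stated assumptions: struct
stream_buf and the buf_*() helpers are hypothetical names, not kernel
APIs. The comments mark where a real endpoint driver would presumably
call dma_sync_single_for_device() and dma_sync_single_for_cpu(), with
the handoff itself signalled out of band (non-prefetchable BAR, MSI, ...).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum buf_owner { OWNER_CPU, OWNER_DEVICE };

struct stream_buf {
	unsigned char data[64];
	size_t len;
	enum buf_owner owner;
};

/*
 * Hand the buffer to the device/peer. In a real driver this is where
 * dma_sync_single_for_device() would go.
 */
static int buf_give_to_device(struct stream_buf *b)
{
	if (b->owner != OWNER_CPU)
		return -1;
	b->owner = OWNER_DEVICE;
	return 0;
}

/*
 * Take the buffer back once the peer has signalled completion. In a
 * real driver this is where dma_sync_single_for_cpu() would go.
 */
static int buf_take_for_cpu(struct stream_buf *b)
{
	if (b->owner != OWNER_DEVICE)
		return -1;
	b->owner = OWNER_CPU;
	return 0;
}

/* CPU-side fill: only allowed while the CPU owns the buffer. */
static int buf_cpu_write(struct stream_buf *b, const void *src, size_t len)
{
	if (b->owner != OWNER_CPU || len > sizeof(b->data))
		return -1;
	memcpy(b->data, src, len);
	b->len = len;
	return 0;
}
```

The point of the model is that every CPU access is bracketed by an
explicit ownership transfer, which is what the register-style
READ_ONCE()/WRITE_ONCE() accesses in pci-epf-test do not do today.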



