On 10/14/24 23:38, Mina Almasry wrote:
On Sat, Oct 12, 2024 at 2:42 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
On Fri, Oct 11, 2024 at 10:33:43AM -0700, Mina Almasry wrote:
On Thu, Oct 10, 2024 at 6:34 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
On Thu, 10 Oct 2024 10:44:38 -0700 Mina Almasry wrote:
I hadn't considered that failing on PP_FLAG_DMA_SYNC_DEV
for dmabuf may be wrong.
I think device memory TCP is not related to this flag,
so the device memory TCP core API should not return failure when
the PP_FLAG_DMA_SYNC_DEV flag is set.
How about removing this check from the device memory TCP core?
I think we need to invert the check..
Mina, WDYT?
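(For context, the check under discussion sits in the devmem provider's
init hook and rejects the flag outright. Roughly the shape below; a
sketch from memory, not the exact upstream code:)

  static int mp_dmabuf_devmem_init(struct page_pool *pool)
  {
          if (!(pool->p.flags & PP_FLAG_DMA_MAP))
                  return -EOPNOTSUPP;     /* provider supplies the dma-addrs */

          if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
                  return -EOPNOTSUPP;     /* <-- the check in question */

          /* ... dmabuf binding setup ... */
          return 0;
  }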
On a closer look, my feeling is similar to Taehee's:
PP_FLAG_DMA_SYNC_DEV should be orthogonal to memory providers. The
memory providers allocate the memory and provide the dma-addr, but
need not dma-sync the dma-addr, right? The driver can sync the
dma-addr itself, or it can delegate the syncing to the pp via
PP_FLAG_DMA_SYNC_DEV. AFAICT the check should be removed, not
inverted, but I could be missing something.
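(The delegation described above is the usual page_pool setup; a minimal
sketch, with illustrative values for ring_size, headroom and the device
pointer:)

  struct page_pool_params pp_params = {
          .flags          = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
          .order          = 0,
          .pool_size      = ring_size,            /* illustrative */
          .nid            = NUMA_NO_NODE,
          .dev            = dev,                  /* the driver's struct device */
          .dma_dir        = DMA_FROM_DEVICE,
          .offset         = XDP_PACKET_HEADROOM,  /* start of packet data */
          .max_len        = PAGE_SIZE - XDP_PACKET_HEADROOM,
  };
  struct page_pool *pool = page_pool_create(&pp_params);

With PP_FLAG_DMA_SYNC_DEV set, the pool syncs recycled buffers for the
device on the driver's behalf; without it, the driver does so itself.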
I don't know much about dmabuf, but this hinges on the question of
whether doing a DMA sync for device on a dmabuf address is:
- a good thing
- a noop
- a bad thing
If it's a good thing or a noop - agreed.
Similar question for the sync for CPU.
I agree that intuitively it should all be fine. But the fact that dmabuf
has a bespoke API for CPU access to the memory makes me worried
that there may be assumptions about these addresses not getting
randomly fed into the normal DMA API..
Sorry, I'm also a bit unsure what the right thing to do here is. The
code that we've been running in GVE does a dma-sync for cpu
unconditionally on RX for dma-buf and non-dmabuf dma-addrs, and we
haven't been seeing issues. It never does a dma-sync for device.
My first question is: why is a dma-sync for device needed on the RX
path at all for some drivers in the first place? For incoming
(non-dmabuf) data, the data is written by the device and read by the
cpu, so a sync for cpu is really what's needed. Is the sync for device
for XDP? Or is it that buffers should be dma-sync'd for device before
they are re-posted to the NIC?
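(To make that concrete, the usual streaming-DMA ownership handoff for
the non-dmabuf case looks roughly like the sketch below; dev/dma/len
are illustrative. The second sync is what PP_FLAG_DMA_SYNC_DEV
delegates to the pool when buffers are recycled back to the ring:)

  /* device has written the frame; give ownership to the CPU before
   * the stack reads headers/payload */
  dma_sync_single_for_cpu(dev, dma, len, DMA_FROM_DEVICE);

  /* ... CPU consumes the data ... */

  /* before the buffer is re-posted to the NIC, hand it back to the
   * device so stale CPU cache lines can't be written back over the
   * next incoming frame */
  dma_sync_single_for_device(dev, dma, len, DMA_FROM_DEVICE);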
Christian/Jason, sorry, quick question: are
dma_sync_single_for_{device|cpu} needed or wanted when the dma-addrs
come from a dma-buf? Or are these dma-addrs to be treated like any
other, with the normal dma_sync_for_{device|cpu} rules?
Um, I think because dma-buf hacks things up and generates illegal
scatterlist entries with weird dma_map_resource() addresses for the
typical P2P case, the dma sync API should not be used on those things.
However, there is no way to know if the dma-buf has done this, and
there are valid cases where the scatterlist is not ill-formed and the
sync is necessary.
We are getting soo close to being able to start fixing these API
issues in dmabuf, I hope next cycle we can begin.. Fingers crossed.
From a CPU architecture perspective you do not need to cache flush PCI
MMIO BAR memory, and perhaps doing so might be problematic on some
arches (???). But you do need to flush normal cacheable CPU memory if
that is in the DMA buf.
Thanks Jason. In that case I agree with Jakub we should take in his change here:
https://lore.kernel.org/netdev/20241009170102.1980ed1d@xxxxxxxxxx/
With this change the driver would delegate dma_sync_for_device to the
page_pool, and the page_pool will skip it altogether for the dma-buf
memory provider.
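(Roughly, the resulting behaviour would be that the pool's recycle path
syncs for device only when the flag is still set, so a provider that
drops it is skipped automatically. A sketch of the idea, not the actual
patch:)

  /* sketch: page_pool recycle path */
  if (pool->p.flags & PP_FLAG_DMA_SYNC_DEV)
          dma_sync_single_range_for_device(pool->p.dev, dma_addr,
                                           pool->p.offset, dma_sync_size,
                                           pool->p.dma_dir);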
Requiring ->dma_map should be common to all providers, as the page pool
shouldn't be dipping into net_iovs to figure out how to map them.
However, looking at this discussion it seems that the ->dma_sync
concern is devmem-specific and should be discarded by pp providers
using dmabufs, i.e. in devmem.c:mp_dmabuf_devmem_init().
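Something along these lines in the init hook (a sketch of the idea, not
a tested change):

  static int mp_dmabuf_devmem_init(struct page_pool *pool)
  {
          if (!(pool->p.flags & PP_FLAG_DMA_MAP))
                  return -EOPNOTSUPP;

          /* dmabuf dma-addrs must not be fed to dma_sync_*(), so the
           * provider takes over and disables syncing for this pool
           * instead of erroring out */
          pool->p.flags &= ~PP_FLAG_DMA_SYNC_DEV;

          /* ... existing dmabuf binding setup ... */
          return 0;
  }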
--
Pavel Begunkov