On 22.01.25 at 15:37, Jason Gunthorpe wrote:
My main interest has been what data structure is produced in the
attach APIs.
E.g. today we have a struct dma_buf_attachment that returns an sg_table.
I'm expecting some kind of new data structure, let's call it "physical
list", that is some efficient coding of meta/addr/len tuples that works
well with the new DMA API. Matthew has been calling this thing phyr..
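As a rough illustration only (none of these names exist anywhere; this is
just to make the meta/addr/len idea concrete), such a physical list might
be little more than:

#include <linux/types.h>

/* Hypothetical sketch of a "physical list" entry; none of these
 * names exist in the kernel today. */
struct phys_range {
	phys_addr_t	addr;	/* start of a physically contiguous range */
	size_t		len;	/* length of the range in bytes */
	unsigned long	attrs;	/* meta: cacheable/encrypted/p2p/... */
};

struct phys_list {
	unsigned int		nr_ranges;
	struct phys_range	ranges[];	/* flexible array of tuples */
};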
I would not use a data structure at all. Instead we should have something
like an iterator/cursor based approach similar to what the new DMA API is
doing.
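Roughly something like the following, purely to illustrate the cursor
idea (all names below are invented; this is not an existing or proposed
API):

#include <linux/dma-buf.h>
#include <linux/types.h>

/* Hypothetical cursor that the exporter advances one physically
 * contiguous range at a time, loosely modelled on how the new DMA API
 * iterates ranges instead of materialising a full table. */
struct dma_buf_cursor {
	phys_addr_t	addr;
	size_t		len;
	unsigned long	attrs;
};

/* Provided by the exporter; returns false when the buffer is exhausted. */
bool dma_buf_attach_next_range(struct dma_buf_attachment *attach,
			       struct dma_buf_cursor *cur);

/* An importer would then simply walk the ranges: */
static void importer_walk(struct dma_buf_attachment *attach)
{
	struct dma_buf_cursor cur = {};

	while (dma_buf_attach_next_range(attach, &cur)) {
		/* program cur.addr/cur.len into the device */
	}
}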
I'm certainly open to this idea. There may be some technical
challenges: it is a big change from scatterlist today, and calling a
function pointer per page sounds like bad performance if there are
a lot of pages..
RDMA would probably have to stuff this immediately into something like
a phyr anyhow, because it needs the full extent of the thing being
mapped to figure out what the HW page size and geometry should be -
that would be trivial though, and an RDMA problem.
Now, if you are asking if the current dmabuf mmap callback can be
improved with the above? Maybe? phyr should have the necessary
information inside it to populate a VMA - eventually even fully
correctly, with all the right cacheable/encrypted/forbidden/etc. flags.
That won't work like this.
Note I said "populate a VMA", ie a helper to build the VMA PTEs only.
See, the exporter needs to be informed about page faults on the VMA so
that it can eventually wait for pending operations to finish and sync
caches.
All of this would still have to be provided outside, in the same way as
today.
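For reference, a heavily simplified sketch of what the exporter side of
mmap looks like today (the my_* helpers are placeholders, not real
APIs): the exporter installs its own fault handler exactly so that it
can synchronise before any PTE is inserted.

#include <linux/dma-buf.h>
#include <linux/mm.h>

struct my_buffer;					/* exporter-private, placeholder */
void my_exporter_sync_for_cpu(struct my_buffer *buf);	/* placeholder */
unsigned long my_exporter_pfn(struct my_buffer *buf, pgoff_t pgoff); /* placeholder */

static vm_fault_t my_exporter_vm_fault(struct vm_fault *vmf)
{
	struct my_buffer *buf = vmf->vma->vm_private_data;

	/* Wait for pending device operations and sync caches before the
	 * CPU is allowed to touch this page. */
	my_exporter_sync_for_cpu(buf);

	return vmf_insert_pfn(vmf->vma, vmf->address,
			      my_exporter_pfn(buf, vmf->pgoff));
}

static const struct vm_operations_struct my_exporter_vm_ops = {
	.fault = my_exporter_vm_fault,
};

static int my_exporter_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
{
	vm_flags_set(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP);
	vma->vm_ops = &my_exporter_vm_ops;
	vma->vm_private_data = dmabuf->priv;
	return 0;
}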
For example, we have cases where multiple devices are in the same IOMMU
domain and re-use their DMA address mappings.
IMHO this is just another flavour of "private" address flow between
two cooperating drivers.
Well, that's the point. The importer is not cooperating here.
The importer doesn't have the slightest idea that it is sharing its DMA
addresses with the exporter.
All the importer gets is: when you want to access this information, use
this address here.
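Which is essentially the sg_table contract today; a typical importer does
nothing more than this (simplified, reservation locking and the actual
device programming left out):

#include <linux/dma-buf.h>
#include <linux/dma-direction.h>
#include <linux/err.h>
#include <linux/scatterlist.h>

static int importer_map(struct dma_buf_attachment *attach)
{
	struct sg_table *sgt;
	struct scatterlist *sg;
	int i;

	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	if (IS_ERR(sgt))
		return PTR_ERR(sgt);

	/* The importer only ever sees DMA addresses; it has no idea how
	 * the exporter produced them. */
	for_each_sgtable_dma_sg(sgt, sg, i) {
		dma_addr_t addr = sg_dma_address(sg);
		unsigned int len = sg_dma_len(sg);

		/* program addr/len into the importing device */
	}

	dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL);
	return 0;
}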
It is not a "dma address" in the sense of a dma_addr_t that was output
from the DMA API. I think that subtle distinction is very
important. When I say pfn/dma address I'm really only talking about
standard DMA API flows, used by generic drivers.
IMHO, DMABUF needs a private address "escape hatch", and cooperating
drivers should do whatever they want when using that flow. The address
is *fully private*, so the cooperating drivers can do whatever they
want. iommu_map in the exporter and pass an IOVA? Fine! Pass a PFN and
iommu_map in the importer? Also fine! Private is private.
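As one concrete shape of that escape hatch (everything here except
iommu_map() itself is made up; IOVA allocation, locking and teardown are
omitted, and the iommu_map() signature is the one in current kernels),
the exporter side of the "iommu_map and pass an IOVA" variant could be
as simple as:

#include <linux/gfp.h>
#include <linux/iommu.h>

unsigned long my_alloc_iova(struct iommu_domain *domain, size_t size); /* placeholder */

static int exporter_map_private(struct iommu_domain *shared_domain,
				phys_addr_t paddr, size_t size,
				unsigned long *out_iova)
{
	unsigned long iova = my_alloc_iova(shared_domain, size);
	int ret;

	ret = iommu_map(shared_domain, iova, paddr, size,
			IOMMU_READ | IOMMU_WRITE, GFP_KERNEL);
	if (ret)
		return ret;

	/* Only the IOVA crosses the dma-buf boundary; what it means is
	 * private to the two cooperating drivers. */
	*out_iova = iova;
	return 0;
}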
But in theory it should be possible to use phyr everywhere eventually, as
long as there's no obvious API-rules-breaking way to go from a phyr back
to a struct page, even when one exists.
I would rather say we should stick to DMA addresses as much as possible.
I remain skeptical of this. Aside from all the technical reasons I
already outlined,
I think it is too much work to have the exporters conditionally build
all sorts of different representations of the same thing depending on
the importer. Like having a lot of DRM drivers generate both a PFN list
and a DMA-mapped list in their export code doesn't sound very appealing
to me at all.
Well, from experience I can say that it is actually the other way around.
We have a very limited number of exporters and a lot of different
importers. So having complexity in the exporter instead of the importer
is absolutely beneficial.
PFN is the special case; in other words, it is the private address being
passed around. And I will push hard not to support that in the DRM
drivers or in any DMA-buf heap.
It makes sense that a driver would be able to conditionally generate
private and generic based on negotiation, but IMHO, not more than one
flavour of generic..
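One could imagine that negotiation happening at attach time, similar in
spirit to how the existing allow_peer2peer flag in struct
dma_buf_attach_ops already lets the exporter decide what kind of
addresses to hand out (the private-address flag mentioned in the comment
is invented; my_importer_move_notify is a placeholder):

#include <linux/dma-buf.h>

void my_importer_move_notify(struct dma_buf_attachment *attach); /* placeholder */

static const struct dma_buf_attach_ops my_importer_attach_ops = {
	/* Existing negotiation bit: the importer declares it can handle
	 * PCI P2P addresses, and the exporter maps accordingly. */
	.allow_peer2peer = true,
	.move_notify = my_importer_move_notify,
};

/* A private-address flavour could be declared the same way, e.g. an
 * invented ".allow_private_addr = true" next to it; the attachment is
 * then created with dma_buf_dynamic_attach(dmabuf, dev,
 * &my_importer_attach_ops, importer_priv). */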
I still strongly think that the exporter should talk to the DMA API to
set up the access path for the importer, and *not* the importer directly.
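For what it's worth, that is more or less what the existing map_dma_buf
contract already does: the exporter calls the DMA API on behalf of the
importer's device (attach->dev). Simplified sketch, with
my_exporter_build_sgt() standing in for however the exporter builds its
sg_table:

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/err.h>

struct sg_table *my_exporter_build_sgt(struct dma_buf *dmabuf); /* placeholder */

static struct sg_table *my_map_dma_buf(struct dma_buf_attachment *attach,
				       enum dma_data_direction dir)
{
	struct sg_table *sgt = my_exporter_build_sgt(attach->dmabuf);
	int ret;

	/* The exporter talks to the DMA API, but the mapping is created
	 * for the importer's device. */
	ret = dma_map_sgtable(attach->dev, sgt, dir, 0);
	if (ret)
		return ERR_PTR(ret);

	return sgt;
}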
Regards,
Christian.