On 23.01.25 at 14:59, Jason Gunthorpe wrote:
On Wed, Jan 22, 2025 at 03:59:11PM +0100, Christian König wrote:
For example, we have cases where multiple devices are in the same IOMMU domain
and re-use their DMA address mappings.
IMHO this is just another flavour of "private" address flow between
two cooperating drivers.
Well, that's the point. The importer is not cooperating here.
If the private address relies on a shared iommu_domain controlled by
the driver, then yes, the importer MUST be cooperating. For instance,
if you send the same private address into RDMA it will explode because
it doesn't have any notion of shared iommu_domain mappings, and it
certainly doesn't set up any such shared domains.
Huh? Why the heck should a driver own its iommu domain?
The domain is owned and assigned by the PCI subsystem under Linux.
The importer doesn't have the slightest idea that it is sharing its DMA
addresses with the exporter.
Of course it does. The importer driver would have had to explicitly
set this up! The normal kernel behavior is that all drivers get
private iommu_domains controlled by the DMA API. If your driver is
doing something else *it did it deliberately*.
As far as I know that is simply not correct. Currently IOMMU
domains/groups are usually shared between devices.
Especially multi function devices get only a single IOMMU domain.
Some of that mess in tegra host1x around this area is not well
structured, it should not be implicitly setting up domains for
drivers. It is old code that hasn't been updated to use the new iommu
subsystem approach for driver-controlled non-DMA API domains.
The new iommu architecture has the probing driver disable the DMA API
and can then manipulate its iommu domain however it likes, safely. I.e.
the probing driver is aware of and participating in disabling the DMA
API.
Why the heck should we do this?
That drivers manage all of that on their own sounds like a massive
step in the wrong direction.
Again, either you are using the DMA API and you work in generic ways
with generic devices or it is "private" and only co-operating drivers
can interwork with private addresses. A private address must not ever
be sent to a DMA-API-using driver and vice versa.
IMHO this is an important architecture point and why Christoph was
frowning on abusing dma_addr_t to represent things that did NOT come
out of the DMA API.
We have a very limited number of exporters and a lot of different importers.
So having complexity in the exporter instead of the importer is absolutely
beneficial.
Isn't every DRM driver both an importer and exporter? That is what I
was expecting at least..
I still strongly think that the exporter should talk with the DMA API to
set up the access path for the importer and *not* the importer directly.
It is contrary to the design of the new API which wants to co-optimize
mapping and HW setup together as one unit.
Yeah and I'm really questioning this design goal. That sounds like
going totally in the wrong direction just because of the RDMA drivers.
For instance in RDMA we want to hint and control the way the IOMMU
mapping works in the DMA API to optimize the RDMA HW side. I can't do
those optimizations if I'm not in control of the mapping.
Why? What is the technical background here?
The same is probably true on the GPU side too, you want IOVAs that
have tidy alignment with your PTE structure, but only the importer
understands its own HW to make the correct hints to the DMA API.
Yeah but then express those as requirements to the DMA API and not
move all the important decisions into the driver where they are
implemented over and over again and potentially broken half the time.
See, drivers are supposed to be simple, small and stupid. They should
be controlled by the core OS and not allowed to do whatever they want.
Driver developers cannot be trusted to always get everything right if
you make it as complicated as this.
Regards,
Christian.
Jason