Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI


 



On 21.06.24 at 00:02, Xu Yilun wrote:
On Thu, Jan 16, 2025 at 04:13:13PM +0100, Christian König wrote:
    On 15.01.25 at 18:09, Jason Gunthorpe wrote:

  On Wed, Jan 15, 2025 at 05:34:23PM +0100, Christian König wrote:

     Granted, let me try to improve this.
     Here is a real world example of one of the issues we ran into and why
     CPU mappings of importers are redirected to the exporter.
     We have a good number of different exporters which track the CPU
     mappings of their backing store using address_space objects in one
     way or another and then use unmap_mapping_range() to invalidate those
     CPU mappings.
     But when importers get the PFNs of the backing store they can look
     behind the curtain and directly insert those PFNs into the CPU page
     tables.
     We literally had tons of cases like this where driver developers
     caused use-after-free issues because the importer created CPU
     mappings on its own without the exporter knowing about it.
     This is just one example of what we ran into. In addition to that,
     basically the whole synchronization between drivers was overhauled as
     well because we found that we can't trust importers to always do the
     right thing.
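
(For reference, a minimal, untested sketch of that exporter-side invalidation; the my_* names are made up, only unmap_mapping_range() is the real kernel API:)

#include <linux/mm.h>
#include <linux/fs.h>

struct my_exporter_bo {
	struct address_space *mapping;	/* backs all CPU mappings of this BO */
	pgoff_t offset;			/* page offset of the BO in that mapping */
	unsigned long num_pages;
};

static void my_exporter_evict(struct my_exporter_bo *bo)
{
	/*
	 * Zap the PTEs of every userspace mapping created through the
	 * exporter's own mmap() path.  A PFN an importer inserted into
	 * its own VMA is invisible here -- that's exactly the
	 * use-after-free problem described above.
	 */
	unmap_mapping_range(bo->mapping,
			    (loff_t)bo->offset << PAGE_SHIFT,
			    (loff_t)bo->num_pages << PAGE_SHIFT, 1);

	/* ... only now is it safe to move or free the backing store. */
}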

  But this, fundamentally, is importers creating attachments and then
  *ignoring the lifetime rules of DMABUF*. If you created an attachment,
  got a move and *ignored the move* because you put the PFN in your own
  VMA, then you are not following the attachment lifetime rules!

    Move notify is solely for informing the importer that they need to
    refresh their DMA mappings and possibly block for ongoing DMA to end.

    These semantics don't work well for CPU mappings because you need to hold
    the reservation lock to make sure that the information stays valid, and you
    can't hold a lock while returning from a page fault.
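
(For reference, a rough and untested sketch of a dynamic importer following those rules; everything prefixed my_ is made up, the attach ops and dma-buf calls are the existing kernel API. Whether the mappings are torn down directly in move_notify or only marked for revalidation is an assumption of this sketch:)

#include <linux/dma-buf.h>
#include <linux/dma-resv.h>
#include <linux/dma-mapping.h>

struct my_importer {
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;
};

/* Called by the exporter with the buffer's dma_resv lock already held. */
static void my_move_notify(struct dma_buf_attachment *attach)
{
	struct my_importer *imp = attach->importer_priv;

	/*
	 * Stop using the old addresses; ongoing DMA must be quiesced
	 * (e.g. by waiting on the fences in the buffer's dma_resv)
	 * before the exporter actually moves the backing store.
	 */
	if (imp->sgt) {
		dma_buf_unmap_attachment(attach, imp->sgt, DMA_BIDIRECTIONAL);
		imp->sgt = NULL;
	}
	/* Mappings are re-created lazily on next use, under the same lock. */
}

static const struct dma_buf_attach_ops my_attach_ops = {
	.allow_peer2peer = true,
	.move_notify	 = my_move_notify,
};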
Dealing with CPU mapping vs. resource invalidation is a little hard, but it is
resolvable by using other types of locks. And I guess for now dma-buf
exporters should always handle this CPU mapping vs. invalidation contention if
they support mmap().

Since it is resolvable, with some invalidation notification a decent importer
could also handle the contention well.

That doesn't work like this.

See, page table updates under DMA-buf work by using the same locking approach for both the validation and the invalidation side. In other words, we hold the same lock while inserting entries into and removing entries from the page tables.

That this is supposed to be an unlocked API means that you can only use it with pre-allocated and hard-pinned memory, without any chance of invalidating it while running. Otherwise you can never be sure of the validity of the address information you got from the exporter.
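
(As a rough, untested sketch of the difference, with made-up my_* names: the locked path holds the buffer's dma_resv lock around the mapping, the unlocked/pinned path has to take the mobility away first:)

#include <linux/dma-buf.h>
#include <linux/dma-resv.h>

/* Addresses are only valid while the dma_resv lock is held -- the same
 * lock the exporter takes before invalidating. */
static struct sg_table *my_map_locked(struct dma_buf_attachment *attach)
{
	struct sg_table *sgt;

	dma_resv_lock(attach->dmabuf->resv, NULL);
	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	dma_resv_unlock(attach->dmabuf->resv);
	/* sgt may be stale as soon as the lock is dropped, unless pinned. */
	return sgt;
}

/* For an unlocked API the memory must be hard-pinned first, so the
 * exporter can never move it while the importer uses the addresses. */
static struct sg_table *my_map_pinned(struct dma_buf_attachment *attach)
{
	struct sg_table *sgt;
	int ret;

	dma_resv_lock(attach->dmabuf->resv, NULL);
	ret = dma_buf_pin(attach);
	sgt = ret ? ERR_PTR(ret) :
		    dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	dma_resv_unlock(attach->dmabuf->resv);
	return sgt;
}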

IIUC, the only remaining concern is that importer device drivers are more
likely to do something wrong, so CPU mapping is moved to the exporter. But
most exporters are also device drivers, so why are they smarter?

Exporters always use their invalidation code path, no matter whether they are exporting their buffers for others to use or are standing alone.

If you do the invalidation on the importer side, you always need both exporter and importer around to test it.

In addition to that, we have many more importers than exporters. E.g. a lot of simple drivers only import DMA-heap buffers and never export anything.

And there are ever increasing mapping needs: today exporters help handle the primary
CPU mapping, but tomorrow should they also help with all other mappings? Clearly that
is not feasible. So maybe conditionally give trust to some importers.

Why should that be necessary? Exporters *must* know what somebody does with their buffers.

If you have a use case the exporter doesn't support in its mapping operation, then that use case most likely doesn't work in the first place.

For example, direct I/O is enabled/disabled by exporters on their CPU mappings based on whether that works correctly for them. And an importer simply doesn't know whether it should use vm_insert_pfn() or vm_insert_page().

We could of course implement that logic in each importer to choose between the different approaches, but then each importer gains logic that it only exercises with a specific exporter. And that doesn't seem to be a good idea at all.
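
(For illustration, an untested sketch of that redirection; my_importer_mmap is a made-up hook, dma_buf_mmap() is the existing helper. The exporter's own mmap code then decides between vm_insert_pfn()/vm_insert_page() and whether direct I/O is allowed:)

#include <linux/dma-buf.h>
#include <linux/mm.h>

static int my_importer_mmap(struct dma_buf_attachment *attach,
			    struct vm_area_struct *vma)
{
	/* Never insert PFNs ourselves; forward the whole CPU mapping to
	 * the exporter so it can also invalidate it later on. */
	return dma_buf_mmap(attach->dmabuf, vma, 0 /* map from the start */);
}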

Regards,
Christian.


Thanks,
Yilun



