On Mon, Mar 19, 2018 at 5:23 PM, Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
> On 19.03.2018 at 16:53, Chris Wilson wrote:
>>
>> Quoting Christian König (2018-03-16 14:22:32)
>> [snip, probably lost too much context]
>>>
>>> This allows for full grown pipelining, e.g. the exporter can say I need
>>> to move the buffer for some operation. Then let the move operation wait
>>> for all existing fences in the reservation object and install the fence
>>> of the move operation as exclusive fence.
>>
>> Ok, the situation I have in mind is the non-pipelined case: revoking
>> dma-buf for mmu_invalidate_range or shrink_slab. I would need a
>> completion event that can be waited on the cpu for all the invalidate
>> callbacks. (Essentially an atomic_t counter plus struct completion; a
>> lighter version of dma_fence, I wonder where I've seen that before ;)
>
>
> Actually that is harmless.
>
> When you need to unmap a DMA-buf because of mmu_invalidate_range or
> shrink_slab you need to wait for its reservation object anyway.

Holding the reservation_object only prevents new fences from being
added; you still have to wait for all the current ones to signal. Also,
we have dma-access without fences in i915. "I hold the
reservation_object" does not imply you can just go and nuke the backing
storage.

> This needs to be done to make sure that the backing memory is now idle, it
> doesn't matter if the jobs were submitted by DMA-buf importers or your own
> driver.
>
> The sg tables pointing to the now released memory might live a bit longer,
> but that is unproblematic and actually intended.

I think that's very problematic. One reason for an IOMMU is that you
have device access isolation, and a broken device can't access memory
it shouldn't be able to access. From that security-in-depth point of
view it's not cool that there are still some sg tables hanging around
that a broken GPU could use. And let's not pretend hw is perfect,
especially GPUs :-)

> When we would try to destroy the sg tables in an mmu_invalidate_range or
> shrink_slab callback we would run into a lockdep horror.

So I'm no expert on this, but I think this is exactly what we're doing
in i915. There's kinda no other way to actually free the memory without
throwing all the nice isolation aspects of an IOMMU to the wind. Can
you please paste the lockdep splats you've seen with amdgpu when trying
to do that?
-Daniel

>
> Regards,
> Christian.
>
>>
>> Even so, it basically means passing a fence object down to the async
>> callbacks for them to signal when they are complete. Just to handle the
>> non-pipelined version. :|
>> -Chris

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch