On Thu, Aug 15, 2019 at 5:10 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote: > > On Thu, Aug 15, 2019 at 04:43:38PM +0200, Daniel Vetter wrote: > > > You have to wait for the gpu to finnish current processing in > > invalidate_range_start. Otherwise there's no point to any of this > > really. So the wait_event/dma_fence_wait are unavoidable really. > > I don't envy your task :| > > But, what you describe sure sounds like a 'registration cache' model, > not the 'shadow pte' model of coherency. > > The key difference is that a regirstationcache is allowed to become > incoherent with the VMA's because it holds page pins. It is a > programming bug in userspace to change VA mappings via mmap/munmap/etc > while the device is working on that VA, but it does not harm system > integrity because of the page pin. > > The cache ensures that each initiated operation sees a DMA setup that > matches the current VA map when the operation is initiated and allows > expensive device DMA setups to be re-used. > > A 'shadow pte' model (ie hmm) *really* needs device support to > directly block DMA access - ie trigger 'device page fault'. ie the > invalidate_start should inform the device to enter a fault mode and > that is it. If the device can't do that, then the driver probably > shouldn't persue this level of coherency. The driver would quickly get > into the messy locking problems like dma_fence_wait from a notifier. > > It is important to identify what model you are going for as defining a > 'registration cache' coherence expectation allows the driver to skip > blocking in invalidate_range_start. All it does is invalidate the > cache so that future operations pick up the new VA mapping. > > Intel's HFI RDMA driver uses this model extensively, and I think it is > well proven, within some limitations of course. > > At least, 'registration cache' is the only use model I know of where > it is acceptable to skip invalidate_range_end. I'm not really well versed in the details of our userptr, but both amdgpu and i915 wait for the gpu to complete from invalidate_range_start. Jerome has at least looked a lot at the amdgpu one, so maybe he can explain what exactly it is we're doing ... -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch