Re: [PATCH 2/5] kernel.h: Add non_block_start/end()

Jason Gunthorpe <jgg@xxxxxxxx> · Thu, 15 Aug 2019 14:35:57 -0300

On Thu, Aug 15, 2019 at 06:25:16PM +0200, Daniel Vetter wrote:

> I'm not really well versed in the details of our userptr, but both
> amdgpu and i915 wait for the gpu to complete from
> invalidate_range_start. Jerome has at least looked a lot at the amdgpu
> one, so maybe he can explain what exactly it is we're doing ...

amdgpu is (wrongly) using hmm for something, I can't really tell what
it is trying to do. The calls to dma_fence under the
invalidate_range_start do not give me a good feeling.

However, i915 shows all the signs of trying to follow the registration
cache model, it even has a nice comment in
i915_gem_userptr_get_pages() explaining that the races it has don't
matter because it is a user space bug to change the VA mapping in the
first place. That just screams registration cache to me.

So it is fine to run HW that way, but if you do, there is no reason to
fence inside the invalidate_range end. Just orphan the DMA buffer and
clean it up & release the page pins when all DMA buffer refs go to
zero. The next access to that VA should get a new DMA buffer with the
right mapping.

In other words the invalidation should be very simple without
complicated locking, or wait_event's. Look at hfi1 for example.

Jason