On Wed, Jul 24, 2019 at 05:33:05PM +0200, Christoph Hellwig wrote: > On Wed, Jul 24, 2019 at 12:28:58PM -0300, Jason Gunthorpe wrote: > > Humm. Actually having looked this some more, I wonder if this is a > > problem: > > What a mess. > > Question: do we even care for the non-blocking events? The only place > where mmu_notifier_invalidate_range_start_nonblock is called is the oom > killer, which means the process is about to die and the pagetable will > get torn down ASAP. Should we just skip them unconditionally? nouveau > already does so, but amdgpu currently tries to handle the non-blocking > notifications. I think the issue is the pages need to get freed to make the memory available without becoming entangled in risky locks and deadlock. Presumably if we go to the 'torn down ASAP' things get more risky that the whole thing deadlocks? I'm guessing a bit, but I *think* non-blocking here really means something closer to WQ_MEM_RECLAIM, ie you can't do anything that would become entangled with locks in the allocator that are pending on OOM?? (if so we really should call this reclaim not non-blocking) ie for ODP umem_rwsem is held by threads while calling into kmalloc, so when we go to do the exit_mmap we still do the invalidate_range_start and can still end up blocked on a lock that is held by a thread waiting for kmalloc to return. Would be nice to know if this guess is right or not. Jason