On Wed, Jan 08, 2020 at 05:16:40PM -0800, Ralph Campbell wrote:
> I hit this while testing HMM with nouveau on linux-5.5-rc5.
> I'm not a lockdep expert but my understanding of this is that an
> invalidation callback could potentially call kzalloc(GFP_KERNEL)
> which could cause another invalidation and recursively deadlock.
> Looking at the drivers/gpu/drm/nouveau/nvkm/ layer, I do see a
> number of places where GFP_KERNEL is used for allocations and I
> don't see an easy way to avoid that.

Not quite.. Any lock held by the invalidation callback becomes a lock
where GFP_KERNEL cannot be used within its critical region.

I.e. we can't have a notifier callback block on a lock which is held
by another thread that is itself blocked on a GFP_KERNEL allocation,
because we then risk deadlocking on other mm locks if that allocation
triggers reclaim.

AFAIK there is no fix from the core side. The driver must respect this
and be organized to deal with it. Daniel fixed the intel driver
already, I fixed RDMA recently, the other drivers must also be fixed.

Some choices:
 - Split up the lock held by the notifier callback so it doesn't need
   to cover allocations
 - Use GFP_ATOMIC for allocations
 - Speculatively do allocations before obtaining the lock and free
   them if they were not needed.

I suppose it will be some trouble for nouveau, but it must be done
there..

Jason
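
For illustration, the third choice above (allocate speculatively before
taking the lock, free if not needed) might look roughly like the
following. This is only a userspace sketch, not nouveau or kernel code:
a pthread mutex stands in for the lock shared with the invalidation
callback, calloc() stands in for kzalloc(GFP_KERNEL), and struct mirror,
struct range_node, mirror_track_range() and mirror_invalidate() are
hypothetical names made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical tracked range, half-open: [start, end). */
struct range_node {
	unsigned long start, end;
	struct range_node *next;
};

/* Hypothetical driver state. */
struct mirror {
	pthread_mutex_t lock;      /* stand-in for the lock the notifier takes */
	struct range_node *head;
};

/*
 * Track a range. Returns 1 if inserted, 0 if it was already tracked,
 * -1 if the allocation failed.
 */
static int mirror_track_range(struct mirror *m, unsigned long start,
			      unsigned long end)
{
	struct range_node *new_node, *it;

	/* Allocate BEFORE taking the lock (stand-in for GFP_KERNEL). */
	new_node = calloc(1, sizeof(*new_node));
	if (!new_node)
		return -1;
	new_node->start = start;
	new_node->end = end;

	pthread_mutex_lock(&m->lock);
	for (it = m->head; it; it = it->next) {
		if (it->start == start && it->end == end) {
			/* Speculation was not needed: drop the lock, then free. */
			pthread_mutex_unlock(&m->lock);
			free(new_node);
			return 0;
		}
	}
	new_node->next = m->head;
	m->head = new_node;
	pthread_mutex_unlock(&m->lock);
	return 1;
}

/*
 * Stand-in for the invalidation callback: takes the same lock but never
 * allocates. Returns the number of overlapping ranges removed.
 */
static int mirror_invalidate(struct mirror *m, unsigned long start,
			     unsigned long end)
{
	struct range_node **pp, *victim, *dead = NULL;
	int removed = 0;

	pthread_mutex_lock(&m->lock);
	pp = &m->head;
	while ((victim = *pp) != NULL) {
		if (victim->start < end && start < victim->end) {
			*pp = victim->next;     /* unlink under the lock */
			victim->next = dead;
			dead = victim;
			removed++;
		} else {
			pp = &victim->next;
		}
	}
	pthread_mutex_unlock(&m->lock);

	/* Free the unlinked nodes outside the critical region. */
	while (dead) {
		victim = dead;
		dead = dead->next;
		free(victim);
	}
	return removed;
}
```

The point of the layout is that no allocation happens while the lock is
held, so the callback side can never end up waiting behind a thread that
is stuck in reclaim. The cost is an occasional wasted allocation when
the range turns out to be tracked already.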