Long story short, we need to manage evictions using dma_resv & dma_fence tracking. The backing storage will then be managed using the ww_mutex borrowed from (and shared via) obj->base.resv, rather than the current obj->mm.lock.

Skipping over the breadcrumbs, the first step is to remove the final crutches of struct_mutex from execbuf and to broaden the hold for the dma-resv to guard not just publishing the dma-fences, but the whole duration of the execbuf submission (holding all objects and their backing store from the point of acquisition to publishing of the final GPU work, after which the guard is delegated to the dma-fences).

This is of course made complicated by our history. On top of the user's objects, we also have the HW/kernel objects with their own lifetimes, and a bunch of auxiliary objects used for working around unhappy HW and for providing the legacy relocation mechanism. We add every auxiliary object to the list of user objects required, and attempt to acquire them en masse. Since all the objects can be known a priori, we can build a list of those objects and pass that to a routine that can resolve the -EDEADLK (and evictions). [To avoid relocations imposing a penalty on sane userspace that avoids them, we do not touch any relocations until necessary, at which point we have to unroll the state and rebuild a new list with more auxiliary buffers to accommodate the extra copy_from_user.]

More examples are included as to how we can break down operations involving multiple objects into an acquire phase prior to those operations, keeping the -EDEADLK handling under control.

execbuf is unique among the interfaces in that it deals with multiple user and kernel buffers. After that, we have callers that in principle care about accessing a single buffer, and so can be migrated over to a helper that permits holding only one such buffer at a time.
That enables us to swap out obj->mm.lock for obj->base.resv->lock, to use lockdep to spot illegal nesting, and to throw away the temporary pins by holding the ww_mutex for the duration instead.

What's changed? Some patch splitting, and we need to pull in Matthew's patch to map the page directories under the ww_mutex.
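
For readers unfamiliar with the pattern, here is a rough userspace sketch of acquiring a known list of objects en masse and resolving -EDEADLK by backing off. It is a toy, single-threaded model and not code from this series: struct toy_lock, toy_lock(), toy_lock_slow(), toy_unlock(), acquire_all() and run_demo() are hypothetical names that only mimic the shape of the ww_mutex fast-path/slow-path dance, and contention is faked with a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for a ww_mutex (hypothetical, single-threaded model). */
struct toy_lock {
	int held;           /* non-zero while our context owns the lock */
	int deadlocks_left; /* how many fast-path attempts report -EDEADLK */
};

/* Fast path: may tell us to back off with -EDEADLK. */
static int toy_lock(struct toy_lock *l)
{
	if (l->deadlocks_left > 0) {
		l->deadlocks_left--;
		return -EDEADLK;
	}
	l->held = 1;
	return 0;
}

/* Slow path: only called while holding nothing else, so it succeeds here. */
static void toy_lock_slow(struct toy_lock *l)
{
	l->held = 1;
}

static void toy_unlock(struct toy_lock *l)
{
	l->held = 0;
}

/*
 * Acquire every lock in the list. On -EDEADLK, drop everything we hold,
 * take the contended lock via the slow path, then restart the walk.
 */
static int acquire_all(struct toy_lock **locks, size_t count,
		       unsigned int *backoffs)
{
	struct toy_lock *slow = NULL; /* lock already taken via the slow path */
	size_t i, j;
	int err = 0;

retry:
	for (i = 0; i < count; i++) {
		if (locks[i] == slow)
			continue; /* already held from the previous backoff */

		err = toy_lock(locks[i]);
		if (err)
			goto unwind;
	}
	return 0;

unwind:
	/* Release everything acquired so far, including any slow-path lock. */
	for (j = 0; j < i; j++)
		toy_unlock(locks[j]);
	if (slow)
		toy_unlock(slow);
	slow = NULL;

	if (err != -EDEADLK)
		return err;

	toy_lock_slow(locks[i]);
	slow = locks[i];
	(*backoffs)++;
	goto retry;
}

/* Three objects, two of which report contention once each. */
unsigned int run_demo(void)
{
	struct toy_lock a = { 0, 0 }, b = { 0, 1 }, c = { 0, 1 };
	struct toy_lock *list[] = { &a, &b, &c };
	unsigned int backoffs = 0;
	int err = acquire_all(list, 3, &backoffs);

	assert(err == 0);
	assert(a.held && b.held && c.held);
	return backoffs;
}
```

The point of building the list up front is that the unwind-and-retry logic lives in one routine, instead of being scattered through every operation that touches more than one object.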