Re: [PATCH 00/37] Replace obj->mm.lock with reservation_ww_class

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Thu, 3 Sep 2020 15:25:05 +0100

On 10/08/2020 10:51, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2020-08-06 10:21:38)

On 05/08/2020 17:22, Thomas Hellström (Intel) wrote:
Hi, Chris,

On 8/5/20 2:21 PM, Chris Wilson wrote:
Long story short, we need to manage evictions using dma_resv & dma_fence
tracking. The backing storage will then be managed using the ww_mutex
borrowed from (and shared via) obj->base.resv, rather than the current
obj->mm.lock.

Skipping over the breadcrumbs,

While perhaps needed fixes, could we submit them as a separate series,
since they, from what I can tell, are not a direct part of the locking
rework, and some of them were actually part of a series that Dave NaK'ed
and may require additional justification?

   the first step is to remove the final
crutches of struct_mutex from execbuf and to broaden the hold for the
dma-resv to guard not just publishing the dma-fences, but for the
duration of the execbuf submission (holding all objects and their
backing store from the point of acquisition to publishing of the final
GPU work, after which the guard is delegated to the dma-fences).

This is of course made complicated by our history. On top of the user's
objects, we also have the HW/kernel objects with their own lifetimes,
and a bunch of auxiliary objects used for working around unhappy HW and
for providing the legacy relocation mechanism. We add every auxiliary
object to the list of user objects required, and attempt to acquire them
en masse. Since all the objects can be known a priori, we can build a
list of those objects and pass that to a routine that can resolve the
-EDEADLK (and evictions). [To avoid relocations imposing a penalty on
sane userspace that avoids them, we do not touch any relocations until
necessary, at will point we have to unroll the state, and rebuild a new
list with more auxiliary buffers to accommodate the extra
copy_from_user].
More examples are included as to how we can break down operations
involving multiple objects into an acquire phase prior to those
operations, keeping the -EDEADLK handling under control.

execbuf is the unique interface in that it deals with multiple user
and kernel buffers. After that, we have callers that in principle care
about accessing a single buffer, and so can be migrated over to a helper
that permits only holding one such buffer at a time. That enables us to
swap out obj->mm.lock for obj->base.resv->lock, and use lockdep to spot
illegal nesting, and to throw away the temporary pins by replacing them
with holding the ww_mutex for the duration instead.

What's changed? Some patch splitting and we need to pull in Matthew's
patch to map the page directories under the ww_mutex.

I would still like to see a justification for the newly introduced async
work, as opposed to add it as an optimizing / regression fixing series
follow the locking rework. That async work introduces a bunch of code
complexity and it would be beneficial to see a discussion of the
tradeoffs and how it alignes with the upstream proposed dma-fence
annotations

On the topic of annotations, maybe do a trybot run with them enabled
with the latest series and then see what pops up.

It didn't change since the run Daniel did. In that run there were two
splats I found,

vm->mutex -> dma_fence_is_signaled/dma_fence_signal -> async work + vm->mutex

Coming back to this after a few weeks - was there any further discussion 
on the dma_fence_is_signaled problem? I don't think Daniel you ever 
replied, or at least I did not see it. Can the signaling section in 
there possibly remain and not be obviously incorrect?

Regards,

Tvrtko

This is the overconstraint Daniel mentioned, it's an entirely false
coupling from the assumption that the async work runs inside the
parent's signaling context. Gut feeling says that should be resolvable
by using lockdep's dependency analysis between local execution/signal
contexts.

The other was

userptr mmu_notifier -> i915_vma_unbind + i915_vma_revoke -> mmu_notifier
[with the interplay between the two global lockmap's, two strikes?]

And that is the well known false positive for userptr that we break by
never allowing the userptr to in be the dev->mapping.

A path not exercised but it should find is synchronous fence eviction.
While binding a fresh fence ensures the vma is idle before starting, we
do not do the same for the fence we steal. That's why care was taken for
execbuf, and it just means teaching the synchronous path to similarly
move the wait outside of the mutex (the simplest will be to wrap the
async handling).
-Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx