Re: [PATCH 00/37] Replace obj->mm.lock with reservation_ww_class

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 06/08/2020 12:55, Daniel Vetter wrote:
On Thu, Aug 6, 2020 at 11:21 AM Tvrtko Ursulin
<tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
On 05/08/2020 17:22, Thomas Hellström (Intel) wrote:
Hi, Chris,
On 8/5/20 2:21 PM, Chris Wilson wrote:
Long story short, we need to manage evictions using dma_resv & dma_fence
tracking. The backing storage will then be managed using the ww_mutex
borrowed from (and shared via) obj->base.resv, rather than the current
obj->mm.lock.

Skipping over the breadcrumbs,

While perhaps needed fixes, could we submit them as a separate series,
since they, from what I can tell, are not a direct part of the locking
rework, and some of them were actually part of a series that Dave NaK'ed
and may require additional justification?


   the first step is to remove the final
crutches of struct_mutex from execbuf and to broaden the hold for the
dma-resv to guard not just publishing the dma-fences, but for the
duration of the execbuf submission (holding all objects and their
backing store from the point of acquisition to publishing of the final
GPU work, after which the guard is delegated to the dma-fences).

This is of course made complicated by our history. On top of the user's
objects, we also have the HW/kernel objects with their own lifetimes,
and a bunch of auxiliary objects used for working around unhappy HW and
for providing the legacy relocation mechanism. We add every auxiliary
object to the list of user objects required, and attempt to acquire them
en masse. Since all the objects can be known a priori, we can build a
list of those objects and pass that to a routine that can resolve the
-EDEADLK (and evictions). [To avoid relocations imposing a penalty on
sane userspace that avoids them, we do not touch any relocations until
necessary, at will point we have to unroll the state, and rebuild a new
list with more auxiliary buffers to accommodate the extra
copy_from_user].
More examples are included as to how we can break down operations
involving multiple objects into an acquire phase prior to those
operations, keeping the -EDEADLK handling under control.

execbuf is the unique interface in that it deals with multiple user
and kernel buffers. After that, we have callers that in principle care
about accessing a single buffer, and so can be migrated over to a helper
that permits only holding one such buffer at a time. That enables us to
swap out obj->mm.lock for obj->base.resv->lock, and use lockdep to spot
illegal nesting, and to throw away the temporary pins by replacing them
with holding the ww_mutex for the duration instead.

What's changed? Some patch splitting and we need to pull in Matthew's
patch to map the page directories under the ww_mutex.

I would still like to see a justification for the newly introduced async
work, as opposed to add it as an optimizing / regression fixing series
follow the locking rework. That async work introduces a bunch of code
complexity and it would be beneficial to see a discussion of the
tradeoffs and how it alignes with the upstream proposed dma-fence
annotations

On the topic of annotations, maybe do a trybot run with them enabled
with the latest series and then see what pops up.

+Daniel, since I noticed last time he was doing that one of the splats
(possibly the only one?) was actually caused by dma_fence_is_signaled.
Which I think comes under the opportunistic signaling rule for the
annotation kerneldoc so looked like a false positive to me. Not sure how
to avoid that one, apart from making it call a special, un-annotated,
flavours of dma_fence_signal(_locked).

Yeah that became a bit more constrained due to the switch from
recursive locks (which don't catch the actual wait vs signalling
issues because lockdep is not as good as it should be) to explicitly
recursive read locks (which do catch the wait vs signal side of
things, but get more annoyed about creative locking schemes on the
read (i.e. signalling side).

I did not get you, what became more constrained? Allowance for opportunistic signaling got retracted? But dma_fence_is_signaled is not even that, I don't see how it makes sense to flag it as foul.

Or you agree dma_fence_is_signaled should be made call new non-annotated helpers as I suggested?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux