On 22.07.2014 17:42, Daniel Vetter wrote:
On Tue, Jul 22, 2014 at 5:35 PM, Christian König
<christian.koenig@xxxxxxx> wrote:
Drivers exporting fences need to provide a fence->signaled and a fence->wait
function, everything else like fence->enable_signaling or calling
fence_signaled() from the driver is optional.
Drivers wanting to use exported fences don't call fence->signaled or
fence->wait in atomic or interrupt context, and not while holding any global
locking primitives (like mmap_sem etc.). Holding locking primitives local
to the driver is ok, as long as they don't conflict with anything possibly
used by their own fence implementation.
Well that's almost what we have right now with the exception that
drivers are allowed to hold (actually must hold, for correctness when
updating fences) the ww_mutexes for dma-bufs (or other buffer objects).
In that case, sorry for all the noise. I really haven't looked in much
detail at anything but Maarten's Radeon patches.
But how does that then work right now? My impression was that it's
mandatory for drivers to call fence_signaled()?
Locking
correctness is enforced with some extremely nasty lockdep annotations
+ additional debugging infrastructure enabled with
CONFIG_DEBUG_WW_MUTEX_SLOWPATH. We really need to be able to hold
dma-buf ww_mutexes while updating fences or waiting for them. And
obviously for ->wait we need non-atomic context, not just
non-interrupt.
Sounds mostly reasonable, but for holding the dma-buf ww_mutex, wouldn't
RCU be more appropriate here? E.g. aren't we just interested in the
currently assigned fence being signaled at some point?
Something like grab ww_mutexes, grab a reference to the current fence
object, release ww_mutex, wait for fence, release reference to the fence
object.
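
To make the sequence above concrete, here is a minimal userspace sketch in
Python. It is only a model, not kernel code: Fence, Buffer and
wait_for_current_fence are hypothetical names, a plain threading.Lock stands
in for the ww_mutex, and a manual counter stands in for the fence refcount.
It says nothing about whether RCU would be the better tool; it only shows
that the wait happens after the buffer lock has been dropped.

```python
import threading


class Fence:
    """Hypothetical stand-in for a fence: refcounted, signaled at most once."""

    def __init__(self):
        self._event = threading.Event()
        self._lock = threading.Lock()
        self.refcount = 1  # the creator's reference

    def get(self):
        # Take an extra reference so the fence outlives the buffer lock.
        with self._lock:
            self.refcount += 1
        return self

    def put(self):
        # Drop a reference taken with get().
        with self._lock:
            self.refcount -= 1

    def signal(self):
        self._event.set()

    def wait(self, timeout=None):
        # Returns True once signaled, False if the timeout expired first.
        return self._event.wait(timeout)


class Buffer:
    """Hypothetical buffer object with a lock and a currently assigned fence."""

    def __init__(self, fence):
        self.lock = threading.Lock()  # assumption: models the ww_mutex
        self.fence = fence


def wait_for_current_fence(buf, timeout=None):
    # 1. Grab the lock, 2. grab a reference to the current fence,
    # 3. release the lock (end of the with block).
    with buf.lock:
        fence = buf.fence.get()
    try:
        # 4. Wait for the fence without holding the buffer lock.
        return fence.wait(timeout)
    finally:
        # 5. Release the reference to the fence object.
        fence.put()
```

Because the lock is released before the wait, a signaler that itself needs
the buffer lock can still make progress while somebody is waiting.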
Agreed that any shared locks are out of the way (especially stuff like
dev->struct_mutex or other non-strictly driver-private stuff, i915 is
really bad here still).
Yeah, that's also a point I wanted to raise about Maarten's patch. Radeon
grabs the read side of its exclusive semaphore while waiting for fences
(because it assumes that the fence it waits for is a Radeon fence).
Assuming that we need to wait in both directions with Prime (e.g. Intel
driver needs to wait for Radeon to finish rendering and Radeon needs to
wait for Intel to finish displaying), this might become a perfect
example of locking inversion.
So from the core fence framework I think we already have exactly this,
and we only need to adjust the radeon implementation a bit to make it
less risky and invasive to the radeon driver logic.
Agreed. The biggest problem I see is the exclusive semaphore I need
to take whenever anything calls into the driver. For the fence code I need
to move that down into the fence->signaled handler, because that can now
be called from outside the driver.
Maarten solved this by telling the driver in the lockup handler (where
we grab the write side of the exclusive lock) that all interrupts are
already enabled, so that fence->signaled hopefully wouldn't mess with
the hardware at all. While this probably works, it just leaves me with a
feeling that we are doing something wrong here.
Christian.
-Daniel