On Thu, Apr 15, 2021 at 4:02 PM Thomas Hellström (Intel) <thomas_os@xxxxxxxxxxxx> wrote: > > > On 4/15/21 3:37 PM, Daniel Vetter wrote: > > On Tue, Apr 13, 2021 at 09:57:06AM +0200, Christian König wrote: > >> > >> Am 13.04.21 um 09:50 schrieb Thomas Hellström: > >>> Hi! > >>> > >>> During the dma_resv conversion of the i915 driver, we've been using ww > >>> transaction lock lists to keep track of ww_mutexes that are locked > >>> during the transaction so that they can be batch unlocked at suitable > >>> locations. Including also the LMEM/VRAM eviction we've ended up with > >>> two static lists per transaction context; one typically unlocked at the > >>> end of transaction and one initialized before and unlocked after each > >>> buffer object validate. This enables us to do sleeping locking at > >>> eviction and keep objects locked on the eviction list until we > >>> eventually succeed allocating memory (modulo minor flaws of course). > >>> > >>> It would be beneficial with the i915 TTM conversion to be able to > >>> introduce a similar functionality that would work in ttm but also > >>> cross-driver in, for example move_notify. It would also be beneficial > >>> to be able to put any dma_resv ww mutex on the lists, and not require > >>> it to be embedded in a particular object type. > >>> > >>> I started scetching on some utilities for this. For TTM, for example, > >>> the idea would be to pass a list head for the ww transaction lock list > >>> in the ttm_operation_ctx. A function taking a ww_mutex could then > >>> either attach a grabbed lock to the list for batch unlocking, or be > >>> responsible for unlocking when the lock's scope is exited. It's also > >>> possible to create sublists if so desired. I believe the below would be > >>> sufficient to cover the i915 functionality. > >>> > >>> Any comments and suggestions appreciated! > >> ah yes Daniel and I haven been discussing something like this for years. > >> > >> I also came up with rough implementation, but we always ran into lifetime > >> issues. > >> > >> In other words the ww_mutexes which are on the list would need to be kept > >> alive until unlocked. > >> > >> Because of this we kind of backed up and said we would need this on the GEM > >> level instead of working with dma_resv objects. > > Yeah there's a few funny concerns here that make this awkward: > > - For simplicity doing these helpers at the gem level should make things a > > bit easier, because then we have a standard way to drop the reference. > > It does mean that the only thing you can lock like this are gem objects, > > but I think that's fine. At least for a first cut. > > > > - This is a bit awkward for vmwgfx, but a) Zack has mentioned he's looking > > into adopting gem bo internally to be able to drop a pile of code and > > stop making vmwgfx the only special-case we have b) drivers which don't > > need this won't need this, so should be fine. > > > > The other awkward thing I guess is that ttm would need to use the > > embedded kref from the gem bo, but that should be transparent I think. > > > > - Next up is dma-buf: For i915 we'd like to do the same eviction trick > > also through p2p dma-buf callbacks, so that this works the same as > > eviction/reservation within a gpu. But for these internal bo you might > > not have a dma-buf, so we can't just lift the trick to the dma-buf > > level. But I think if we pass e.g. a struct list_head and a callback to > > unreference/unlock all the buffers in there to the exporter, plus > > similar for the slowpath lock, then that should be doable without > > glorious layering inversions between dma-buf and gem. > > > > I think for dma-buf it should even be ok if this requires that we > > allocate an entire structure with kmalloc or something, allocating > > memory while holding dma_resv is ok. > > Yes, the thing here with the suggested helpers is that you would just > embed a trans_lockitem struct in the gem object (and defines the gem > object ops). Otherwise and for passing to dma-buf this is pretty much > exactly what you are suggesting, but the huge benefit of encapsulating > the needed members like this is that when we need to change something we > change it in just one place. > > For anything that doesn't have a gem object (dma-buf, vmwgfx or > whatever) you have the choice of either allocating a struct > trans_lockitem or embed it wherever you prefer. In particular, this is > beneficial where you have a single dma-resv class ww-mutex sitting > somewhere in the way and you don't want to needlessly have a gem object > that embeds it. The thing is, everyone who actually uses dma_resv_lock has a gem_buffer_object underneath. So it feels a bit like flexibility for no real need, and I think it would make it slightly more awkard for gem drivers to neatly integrate into their cs path. The lockitem struct works, but it is a bit cumbersome. Also if we add some wrappers to e.g. add a gem_bo to the ctx, then if we decide to slip the lockitem in there, we still only need to touch the helper code, and not all drivers. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx