Quoting Rodrigo Vivi (2019-02-26 19:53:47) > On Wed, Feb 06, 2019 at 01:03:12PM +0000, Chris Wilson wrote: > > Previously, we were able to rely on the recursive properties of > > struct_mutex to allow us to serialise revoking mmaps and reacquiring the > > FENCE registers with them being clobbered over a global device reset. > > I then proceeded to throw out the baby with the bath water in order to > > pursue a struct_mutex-less reset. > > > > Perusing LWN for alternative strategies, the dilemma on how to serialise > > access to a global resource on one side was answered by > > https://lwn.net/Articles/202847/ -- Sleepable RCU: > > > > 1 int readside(void) { > > 2 int idx; > > 3 rcu_read_lock(); > > 4 if (nomoresrcu) { > > 5 rcu_read_unlock(); > > 6 return -EINVAL; > > 7 } > > 8 idx = srcu_read_lock(&ss); > > 9 rcu_read_unlock(); > > 10 /* SRCU read-side critical section. */ > > 11 srcu_read_unlock(&ss, idx); > > 12 return 0; > > 13 } > > 14 > > 15 void cleanup(void) > > 16 { > > 17 nomoresrcu = 1; > > 18 synchronize_rcu(); > > 19 synchronize_srcu(&ss); > > 20 cleanup_srcu_struct(&ss); > > 21 } > > > > No more worrying about stop_machine, just an uber-complex mutex, > > optimised for reads, with the overhead pushed to the rare reset path. > > > > However, we do run the risk of a deadlock as we allocate underneath the > > SRCU read lock, and the allocation may require a GPU reset, causing a > > dependency cycle via the in-flight requests. We resolve that by declaring > > the driver wedged and cancelling all in-flight rendering. > > > > v2: Use expedited rcu barriers to match our earlier timing > > characteristics. > > v3: Try to annotate locking contexts for sparse > > v4: Reduce selftest lock duration to avoid a reset deadlock with fences > > v5: s/srcu/reset_backoff_srcu/ > > > > Testcase: igt/gem_mmap_gtt/hang > > Fixes: eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex") > > Hi Chris, > > this patch didn't applied cleanly on dinf > so I noticed that I could also get > > 115ff80a97cf ("drm/i915: Defer removing fence register tracking to rpm wakeup") > > so that applies, but then I noticed that there's a Fixes of > this patch here: > > de3a87cf6352 ("drm/i915: Recursive i915_reset_trylock() verboten") > > It seems sane to pick 3 of them, but I'd like to confirm with you first. > > Thoughts? There's a at least one later patch as well. I think the safest course of option is to ignore these fixups, as the glitch is very rare (after a gpu hang as well), only temporary and does not impact stability or security (they can only see their own buffers/data in a slightly different order). -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx