Re: [PATCH v3 1/6] drm/i915: Fix request locking during error capture & debugfs dump

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/19/2023 07:16, Andy Shevchenko wrote:
On Wed, Jan 18, 2023 at 10:49:55PM -0800, John.C.Harrison@xxxxxxxxx wrote:
From: John Harrison <John.C.Harrison@xxxxxxxxx>

When GuC support was added to error capture, the locking around the
request object was broken. Fix it up.

The context based search manages the spinlocking around the search
internally. So it needs to grab the reference count internally as
well. The execlist only request based search relies on external
locking, so it needs an external reference count but within the
spinlock not outside it.

The only other caller of the context based search is the code for
dumping engine state to debugfs. That code wasn't previously getting
an explicit reference at all as it does everything while holding the
execlist specific spinlock. So, that needs updaing as well as that
spinlock doesn't help when using GuC submission. Rather than trying to
conditionally get/put depending on submission model, just change it to
always do the get/put.

In addition, intel_guc_find_hung_context() was not acquiring the
correct spinlock before searching the request list. So fix that up
too. While at it, add some extra whitespace padding for readability.
...

+		found = false;
+		spin_lock(&ce->guc_state.lock);
  		list_for_each_entry(rq, &ce->guc_state.requests, sched.link) {
  			if (i915_test_request_state(rq) != I915_REQUEST_ACTIVE)
  				continue;
+ found = true;
+			break;
+		}
This can be combined to (see also below)

		list_for_each_entry(rq, &ce->guc_state.requests, sched.link) {
			if (i915_test_request_state(rq) == I915_REQUEST_ACTIVE)
				break;
		}

+		spin_unlock(&ce->guc_state.lock);
Instead of 'found' you can check the current entry pointer

		if (!list_entry_is_head(...))

And because requests can only be messed up with the guc_state itself, I think
you don't need to perform the above check under spinlock, so it's safe.
I'm not following the argument as to why it is safe to test a guc_state owned list outside of holding the guc_state spinlock.

I also think that having an explicit 'found' flag makes the code more readable and immediately obvious as to what is going on. For the sake of one bool (which the compiler would optimise out anyway), I don't think it is worth the obfuscation of behaviour and the risk of "I think this will work".

John.



+		if (found) {
  			intel_engine_set_hung_context(engine, ce);




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux