On 19/09/2019 18:49, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2019-09-19 18:11:14)
On 19/09/2019 14:26, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2019-09-19 14:02:19)
On 19/09/2019 12:19, Chris Wilson wrote:
+static struct intel_timeline *get_timeline(struct i915_request *rq)
+{
+ struct intel_timeline *tl;
+
+ /*
+ * Even though we are holding the engine->active.lock here, there
+ * is no control over the submission queue per-se and we are
+ * inspecting the active state at a random point in time, with an
+ * unknown queue. Play safe and make sure the timeline remains valid.
+ * (Only being used for pretty printing, one extra kref shouldn't
+ * cause a camel stampede!)
+ */
+ rcu_read_lock();
+ tl = rcu_dereference(rq->timeline);
+ if (!kref_get_unless_zero(&tl->kref))
+ tl = NULL;
+ rcu_read_unlock();
How can it be NULL under the active lock? Isn't that the same assertion
from i915_timeline_get_active.
Not NULL, but retired. The difference is that during submission we know
that this request's context/timeline must be currently pinned until
a subsequent request (containing the idle-barriers) is submitted. The
danger I worry about here is that subsequent idle request may be already
submitted and since the queued requests may *already* have been retired,
the timeline may be unpinned and indeed dropped it's last reference.
But here it is under the engine->active.lock with interrupts disabled
and the requests are fetched from execlists ports. Timeline is not
guaranteed to be kept alive under these conditions? intel_context
reference will be held until process_csb schedules it out so I'd expect
timeline and hwsp to be there. But I could be lost in the new scheme of
things.
I felt it was prudent to only rely on the active pin. You are right in
that we have a context reference if it is in active, and that context
holds a reference to the timeline. But... engine->active.lock is not
the lock that guards rq->timeline, so I feel uneasy on extending
i915_request_active_timeline() too far. Outside of the submission
pathway, inside a pretty printer, it feels safer (whatever changes may
come we don't have to worry about it) to not assume anything and just
use the failsafe rcu_dereference() + kref.
Well okay, I can accept that in the overall situation.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx