[PATCH] drm/i915: Ensure consistent control flow __i915_gem_active_get_rcu

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This issue here is (I think) purely theoretical, since a compiler
would need to be especially foolish to recompute the value of
i915_gem_request_completed right after it was already used. Hence the
additional barrier() is also not really a restriction.

But I believe this to be at least permissible, and since our rcu
trickery is a beast it's worth to annotate all the corner cases.
Chris proposed to instead just wrap a READ_ONCE around
request->fence.seqno in i915_gem_request_completed. But that has a
measurable impact on code size, and everywhere we hold a full
reference to the underlying request it's also not needed. And
personally I'd like to have just enough barriers and locking needed
for correctness, but not more - it makes it much easier in the future
to understand what's going on.

Since the busy ioctl has now fully embraced it's races there's no
point annotating it there too. We really only need it in
active_get_rcu, since that function _must_ deliver a correct snapshot
of the active fences (and not chase something else).

v2: Polish the comment a bit more (Chris).

Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
---
 drivers/gpu/drm/i915/i915_gem_request.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h
index 6c72bd8d9423..91014de8bfbc 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.h
+++ b/drivers/gpu/drm/i915/i915_gem_request.h
@@ -472,6 +472,19 @@ __i915_gem_active_get_rcu(const struct i915_gem_active *active)
 		if (!request || i915_gem_request_completed(request))
 			return NULL;
 
+		/* An especially silly compiler could decide to recompute the
+		 * result of i915_gem_request_completed, more specifically
+		 * re-emit the load for request->fence.seqno. A race would catch
+		 * a later seqno value, which could flip the result from true to
+		 * false. Which means part of the instructions below might not
+		 * be executed, while later on instructions are executed. Due to
+		 * barriers within the refcounting the inconsistency can't reach
+		 * past the call to i915_gem_request_get_rcu, but not executing
+		 * that while still executing i915_gem_request_put() creates
+		 * havoc enough.  Prevent this with a compiler barrier.
+		 */
+		barrier();
+
 		request = i915_gem_request_get_rcu(request);
 
 		/* What stops the following rcu_access_pointer() from occurring
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux