Re: [PATCH 3/8] drm/i915: Cope with request list state change during error state capture

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 09/10/2015 09:28, Daniel Vetter wrote:
On Thu, Oct 08, 2015 at 07:31:35PM +0100, Tomas Elf wrote:
Since we're not synchronizing the ring request list during error state capture
the request list state might change between the time the corresponding error
request list was allocated and dimensioned to the time when the ring request
list is actually captured into the error state. If this happens, throw a
WARNING and do early exit and be aware that the captured error state might not
be fully reliable.

Please don't throw a WARNING since this is expected to occasionally
happen. DRM_DEBUG_DRIVER is enough imo.
-Daniel

I still don't see how it could happen without leading to reads of unallocated memory. The error state request list has been allocated to a certain size equal to num_requests and this loop seems to assume that the error state request list maintains the same size as the driver request list, which is not the case - leading to crashes, which is how I happened to notice it.

I can obviously remove the warning but are you saying we shouldn't even take action if it happens? Such as early exit?

Thanks,
Tomas



Signed-off-by: Tomas Elf <tomas.elf@xxxxxxxxx>
---
  drivers/gpu/drm/i915/i915_gpu_error.c | 13 +++++++++++++
  1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 32c1799..cc75ca4 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1071,6 +1071,19 @@ static void i915_gem_record_rings(struct drm_device *dev,
  		list_for_each_entry_safe(request, tmpreq, &ring->request_list, list) {
  			struct drm_i915_error_request *erq;

+			if (WARN_ON(!request || count >= error->ring[i].num_requests)) {
+				/*
+				 * If the ring request list was changed in
+				 * between the point where the error request
+				 * list was created and dimensioned and this
+				 * point then just update the num_requests
+				 * field to reflect this.
+				 */
+				error->ring[i].num_requests =
+					min(count, error->ring[i].num_requests);
+				break;
+			}
+
  			erq = &error->ring[i].requests[count++];
  			erq->seqno = request->seqno;
  			erq->jiffies = request->emitted_jiffies;
--
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux