Re: [PATCH 1/3] drm/i915/perf: rework aging tail workaround

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Mar 21, 2020 at 04:26:42PM -0700, Dixit, Ashutosh wrote:
On Thu, 19 Mar 2020 15:52:01 -0700, Umesh Nerlige Ramappa wrote:

From: Lionel Landwerlin <lionel.g.landwerlin@xxxxxxxxx>

We're about to introduce an options to open the perf stream, giving
the user ability to configure how often it wants the kernel to poll
the OA registers for available data.

Right now the workaround against the OA tail pointer race condition
requires at least twice the internal kernel polling timer to make any
data available.

This changes introduce checks on the OA data written into the circular
buffer to make as much data as possible available on the first
iteration of the polling timer.

/snip/

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 3222f6cd8255..c1429d3acaf9 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -223,26 +223,17 @@
  *
  * Although this can be observed explicitly while copying reports to userspace
  * by checking for a zeroed report-id field in tail reports, we want to account
- * for this earlier, as part of the oa_buffer_check to avoid lots of redundant
- * read() attempts.
- *
- * In effect we define a tail pointer for reading that lags the real tail
- * pointer by at least %OA_TAIL_MARGIN_NSEC nanoseconds, which gives enough
- * time for the corresponding reports to become visible to the CPU.
- *
- * To manage this we actually track two tail pointers:
- *  1) An 'aging' tail with an associated timestamp that is tracked until we
- *     can trust the corresponding data is visible to the CPU; at which point
- *     it is considered 'aged'.
- *  2) An 'aged' tail that can be used for read()ing.
- *
- * The two separate pointers let us decouple read()s from tail pointer aging.
- *
- * The tail pointers are checked and updated at a limited rate within a hrtimer
- * callback (the same callback that is used for delivering EPOLLIN events)
- *
- * Initially the tails are marked invalid with %INVALID_TAIL_PTR which
- * indicates that an updated tail pointer is needed.
+ * for this earlier, as part of the oa_buffer_check_unlocked to avoid lots of
+ * redundant read() attempts.
+ *
+ * We workaround this issue in oa_buffer_check_unlocked() by reading the reports
+ * in the OA buffer, starting from the tail reported by the HW until we find 2
+ * consecutive reports with their first 2 dwords of not at 0. Those dwords are

until we find a report with its first 2 dwords not 0 meaning its previous
report is completely in memory and ready to be read.

+ * also set to 0 once read and the whole buffer is cleared upon OA buffer
+ * initialization. The first dword is the reason for this report while the
+ * second is the timestamp, making the chances of having those 2 fields at 0
+ * fairly unlikely. A more detailed explanation is available in
+ * oa_buffer_check_unlocked().

@@ -477,16 +468,6 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)
	 */
	spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);

	hw_tail = stream->perf->ops.oa_hw_tail_read(stream);

	hw_tail &= ~(report_size - 1);

@@ -496,64 +477,64 @@ static bool oa_buffer_check_unlocked(struct i915_perf_stream *stream)

	now = ktime_get_mono_fast_ns();

+	if (hw_tail == stream->oa_buffer.aging_tail &&
+	   (now - stream->oa_buffer.aging_timestamp) > OA_TAIL_MARGIN_NSEC) {
+		/* If the HW tail hasn't move since the last check and the HW
+		 * tail has been aging for long enough, declare it the new
+		 * tail.
+		 */
+		stream->oa_buffer.tail = stream->oa_buffer.aging_tail;
+	} else {
+		u32 head, tail;

+		/* NB: The head we observe here might effectively be a little
+		 * out of date. If a read() is in progress, the head could be
+		 * anywhere between this head and stream->oa_buffer.tail.
+		 */
+		head = stream->oa_buffer.head - gtt_offset;

+		hw_tail -= gtt_offset;
+		tail = hw_tail;

+		/* Walk the stream backward until we find a report with dword 0
+		 * & 1 not at 0. Since the circular buffer pointers progress by
+		 * increments of 64 bytes and that reports can be up to 256
+		 * bytes long, we can't tell whether a report has fully landed
+		 * in memory before the first 2 dwords of the following report
+		 * have effectively landed.
+		 *
+		 * This is assuming that the writes of the OA unit land in
+		 * memory in the order they were written to.
+		 * If not : (╯°□°)╯︵ ┻━┻
		 */
-		if (hw_tail >= gtt_offset &&
-		    hw_tail < (gtt_offset + OA_BUFFER_SIZE)) {
-			stream->oa_buffer.tails[!aged_idx].offset =
-				aging_tail = hw_tail;
-			stream->oa_buffer.aging_timestamp = now;
-		} else {
-			drm_err(&stream->perf->i915->drm,
-				"Ignoring spurious out of range OA buffer tail pointer = %x\n",
-				hw_tail);
+		while (OA_TAKEN(tail, head) >= report_size) {
+			u32 previous_tail = (tail - report_size) & (OA_BUFFER_SIZE - 1);
+			u32 *report32 = (void *)(stream->oa_buffer.vaddr + previous_tail);

Sorry, this is wrong. This should just be:

			tail = (tail - report_size) & (OA_BUFFER_SIZE - 1);
			report32 = (void *)(stream->oa_buffer.vaddr + tail);

Otherwise when we break out of the loop below tail is still set one
report_size ahead. previous_tail is not needed. (In the previous version of
the patch this used to work out correctly).

oh, that's right. tail should point to the first report that has non-zero dwords. everything before that should be considered landed.
Thanks,
Umesh


+
+			/* Head of the report indicated by the HW tail register has
+			 * indeed landed into memory.
+			 */
+			if (report32[0] != 0 || report32[1] != 0)
+				break;
+
+			tail = previous_tail;
		}
+
+		if (((tail - hw_tail) & (OA_BUFFER_SIZE - 1)) > report_size &&

nit: OA_TAKEN(hw_tail, tail) > report_size?

+		    __ratelimit(&stream->perf->tail_pointer_race))
+			DRM_NOTE("unlanded report(s) head=0x%x "
+				 "tail=0x%x hw_tail=0x%x\n",
+				 head, tail, hw_tail);
+
+		stream->oa_buffer.tail = gtt_offset + tail;
+		stream->oa_buffer.aging_tail = gtt_offset + hw_tail;
+		stream->oa_buffer.aging_timestamp = now;
	}

	spin_unlock_irqrestore(&stream->oa_buffer.ptr_lock, flags);

-	return aged_tail == INVALID_TAIL_PTR ?
-		false : OA_TAKEN(aged_tail, head) >= report_size;
+	return OA_TAKEN(stream->oa_buffer.tail - gtt_offset,
+			stream->oa_buffer.head - gtt_offset) >= report_size;
 }

@@ -303,6 +292,12 @@ struct i915_perf_stream {
		 * OA buffer data to userspace.
		 */
		u32 head;
+
+		/**
+		 * @tail: The last tail verified tail that can be read by

The last verified tail
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux