Re: [PATCH 15/17] drm/i915: Increase GuC log buffer size to reduce flush interrupts

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Mon, 18 Jul 2016 10:54:26 +0100

On 15/07/16 17:20, Goel, Akash wrote:
On 7/15/2016 8:37 PM, Tvrtko Ursulin wrote:
On 15/07/16 15:42, Goel, Akash wrote:
On 7/15/2016 5:27 PM, Tvrtko Ursulin wrote:

On 10/07/16 14:41, akash.goel@xxxxxxxxx wrote:
From: Akash Goel <akash.goel@xxxxxxxxx>

In cases where GuC generate logs at a very high rate, correspondingly
the rate of flush interrupts is also very high.
So far total 8 pages were allocated for storing both ISR & DPC logs.
As per the half-full draining protocol followed by GuC, by doubling
the number of pages, the frequency of flush interrupts can be cut down
to almost half, which then helps in reducing the logging overhead.
So now allocating 8 pages apiece for ISR & DPC logs.

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Signed-off-by: Akash Goel <akash.goel@xxxxxxxxx>
---
  drivers/gpu/drm/i915/intel_guc_fwif.h | 8 ++++----
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h
b/drivers/gpu/drm/i915/intel_guc_fwif.h
index 1de6928..7521ed5 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -104,9 +104,9 @@
  #define   GUC_LOG_ALLOC_IN_MEGABYTE    (1 << 3)
  #define   GUC_LOG_CRASH_PAGES        1
  #define   GUC_LOG_CRASH_SHIFT        4
-#define   GUC_LOG_DPC_PAGES        3
+#define   GUC_LOG_DPC_PAGES        7
  #define   GUC_LOG_DPC_SHIFT        6
-#define   GUC_LOG_ISR_PAGES        3
+#define   GUC_LOG_ISR_PAGES        7
  #define   GUC_LOG_ISR_SHIFT        9
  #define   GUC_LOG_BUF_ADDR_SHIFT    12

@@ -436,9 +436,9 @@ enum guc_log_buffer_type {
   *        |   Crash dump state header     |
   * Page1  +-------------------------------+
   *        |           ISR logs            |
- * Page5  +-------------------------------+
- *        |           DPC logs            |
   * Page9  +-------------------------------+
+ *        |           DPC logs            |
+ * Page17 +-------------------------------+
   *        |         Crash Dump logs       |
   *        +-------------------------------+
   *


I don't mind - but does it help? And how much and for what? Haven't you
later found that the uncached reads were the main issue?
This change along with kthread patch, helped reduce the overflow counts
and even eliminate them for some benchmarks.

Though with the impending optimization for Uncached reads there should
be further improvements but in my view, notwithstanding the improvement
w.r.t overflow count, its still a better configuration to work with as
flush interrupt frequency is cut down to half and not able to see any
apparent downsides to it.

I was primarily thinking to go with a minimal and simplest set of
patches to implement the feature.

I second that and working with the same intent.

Logic was that apparently none of the smart and complex optimisations
managed to solve the dropped interrupt issue, until the slowness of the
uncached read was discovered to be the real/main issue.

So it seems that is something that definitely needs to be implemented.
(Whether or not it will be possible to use SSE instructions to do the
read I don't know.)


log buffer resizing and rt priority kthread changes have definitely
helped significantly.

Only of late we realized that there is a potential way to speed up
Uncached reads also. Moreover I am yet to test that on kernel side.
So until that is tested & proves to be enough, we have to rely on the
other optimizations & can't dismiss them

Maybe, depends if, what I thought was the case, none of the other 
optimizations actually enabled a drop-free logging in all interesting 
scenarios.

If we conclude that simply improving the copy speed removes the need for 
any other optimisations and complications, we can talk about whether 
every individual one of those still makes sense.

Assuming it is possible, then the question is whether there is need for
all the other optimisations. Ie. do we need the kthread with rtprio or
would a simple worker be enough?
I think we can take a call, once we have the results with Uncached read
optimization.

Agreed. Lets see how that works out and the discuss on how the final 
series should look like.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx