[RFC] drm/i915: Move the STC LRA eviction policy workaround after ring init.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Clearing bit 5 of CACHE_MODE_0 is necessary to prevent GPU hangs in
OpenGL programs such as Google MapsGL, Google Earth, and gzdoom.

While commit 09be664ecc77d58 introduced the workaround, the register
write didn't actually stick: a printk and I915_READ immediately after
the write would return the new value, but a second one would show that
it had reverted to the original value...even with no intervening code.

The hardware documentation mentions that the ring must be idle before
writing CACHE_MODE_0.  This provided a clue that it might be necessary
to initialize the rings before attempting to program the register.  Sure
enough, moving the write after ring initialization makes it stick.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47535
Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
---
 drivers/gpu/drm/i915/i915_dma.c |    7 +++++++
 drivers/gpu/drm/i915/intel_pm.c |    4 ----
 2 files changed, 7 insertions(+), 4 deletions(-)

Here's a horrible patch---putting that one workaround bit directly inside 
i915_load_modeset_init is clearly a terrible idea.  There will obviously
be many more, and per-generation.  I suspect that many more of the
workaround bits we're setting in init_clock_gating may need to be moved
later in the init process too...but I haven't checked to see which ones
are failing to stick.

So I guess there's a couple questions:

* Does it make sense to move ALL the init_clock_gating stuff to this
  point in time?  Or are there some registers that need to be done early,
  where they are today?  (If we can move them all, we could just move the
  call to init_clock_gating and be done with it...)

  Apparently CACHE_MODE_0 is a context-specific register, while some others
  are not.  I like having all the workaround bits in one place, but that
  may or may not be feasible...

* Do we want to make an intel_apply_workarounds() function or such?
  Perhaps use a function pointer that gets filled in on a per-chipset basis?

* Is this the best time to set it?  Later?  Elsewhere?

* What should the code look like long-term, and what would be easiest for
  backporting to stable kernels?

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index e2983a9..532c8cf 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1128,6 +1128,13 @@ static int i915_load_modeset_init(struct drm_device *dev)
 
 	intel_modeset_gem_init(dev);
 
+	/* Apply workaround bits.  These need to be done after ring init. */
+	if (IS_GEN6(dev)) {
+		/* clear masked bit */
+		I915_WRITE(CACHE_MODE_0,
+			   CM0_STC_EVICT_DISABLE_LRA_SNB << CM0_MASK_SHIFT);
+	}
+
 	ret = drm_irq_install(dev);
 	if (ret)
 		goto cleanup_gem;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0552058..070ab27 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2663,10 +2663,6 @@ static void gen6_init_clock_gating(struct drm_device *dev)
 	I915_WRITE(WM2_LP_ILK, 0);
 	I915_WRITE(WM1_LP_ILK, 0);
 
-	/* clear masked bit */
-	I915_WRITE(CACHE_MODE_0,
-		   CM0_STC_EVICT_DISABLE_LRA_SNB << CM0_MASK_SHIFT);
-
 	I915_WRITE(GEN6_UCGCTL1,
 		   I915_READ(GEN6_UCGCTL1) |
 		   GEN6_BLBUNIT_CLOCK_GATE_DISABLE |
-- 
1.7.10



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux