Clearing bit 5 of CACHE_MODE_0 is necessary to prevent GPU hangs in OpenGL programs such as Google MapsGL, Google Earth, and gzdoom. While commit 09be664ecc77d58 introduced the workaround, the register write didn't actually stick: a printk and I915_READ immediately after the write would return the new value, but a second one would show that it had reverted to the original value...even with no intervening code. The hardware documentation mentions that the ring must be idle before writing CACHE_MODE_0. This provided a clue that it might be necessary to initialize the rings before attempting to program the register. Sure enough, moving the write after ring initialization makes it stick. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47535 Signed-off-by: Kenneth Graunke <kenneth at whitecape.org> --- drivers/gpu/drm/i915/i915_dma.c | 7 +++++++ drivers/gpu/drm/i915/intel_pm.c | 4 ---- 2 files changed, 7 insertions(+), 4 deletions(-) Here's a horrible patch---putting that one workaround bit directly inside i915_load_modeset_init is clearly a terrible idea. There will obviously be many more, and per-generation. I suspect that many more of the workaround bits we're setting in init_clock_gating may need to be moved later in the init process too...but I haven't checked to see which ones are failing to stick. So I guess there's a couple questions: * Does it make sense to move ALL the init_clock_gating stuff to this point in time? Or are there some registers that need to be done early, where they are today? (If we can move them all, we could just move the call to init_clock_gating and be done with it...) Apparently CACHE_MODE_0 is a context-specific register, while some others are not. I like having all the workaround bits in one place, but that may or may not be feasible... * Do we want to make an intel_apply_workarounds() function or such? Perhaps use a function pointer that gets filled in on a per-chipset basis? * Is this the best time to set it? Later? Elsewhere? * What should the code look like long-term, and what would be easiest for backporting to stable kernels? diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index e2983a9..532c8cf 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1128,6 +1128,13 @@ static int i915_load_modeset_init(struct drm_device *dev) intel_modeset_gem_init(dev); + /* Apply workaround bits. These need to be done after ring init. */ + if (IS_GEN6(dev)) { + /* clear masked bit */ + I915_WRITE(CACHE_MODE_0, + CM0_STC_EVICT_DISABLE_LRA_SNB << CM0_MASK_SHIFT); + } + ret = drm_irq_install(dev); if (ret) goto cleanup_gem; diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 0552058..070ab27 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2663,10 +2663,6 @@ static void gen6_init_clock_gating(struct drm_device *dev) I915_WRITE(WM2_LP_ILK, 0); I915_WRITE(WM1_LP_ILK, 0); - /* clear masked bit */ - I915_WRITE(CACHE_MODE_0, - CM0_STC_EVICT_DISABLE_LRA_SNB << CM0_MASK_SHIFT); - I915_WRITE(GEN6_UCGCTL1, I915_READ(GEN6_UCGCTL1) | GEN6_BLBUNIT_CLOCK_GATE_DISABLE | -- 1.7.10