This is a note to let you know that I've just added the patch titled drm/i915: Avoid tweaking evaluation thresholds on Baytrail v3 to the 4.10-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: drm-i915-avoid-tweaking-evaluation-thresholds-on-baytrail-v3.patch and it can be found in the queue-4.10 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From 34dc8993eef63681b062871413a9484008a2a78f Mon Sep 17 00:00:00 2001 From: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> Date: Wed, 15 Feb 2017 15:52:59 +0200 Subject: drm/i915: Avoid tweaking evaluation thresholds on Baytrail v3 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> commit 34dc8993eef63681b062871413a9484008a2a78f upstream. Certain Baytrails, namely the 4 cpu core variants, have been plaqued by spurious system hangs, mostly occurring with light loads. Multiple bisects by various people point to a commit which changes the reclocking strategy for Baytrail to follow its bigger brethen: commit 8fb55197e64d ("drm/i915: Agressive downclocking on Baytrail") There is also a review comment attached to this commit from Deepak S on avoiding punit access on Cherryview and thus it was excluded on common reclocking path. By taking the same approach and omitting the punit access by not tweaking the thresholds when the hardware has been asked to move into different frequency, considerable gains in stability have been observed. With J1900 box, light render/video load would end up in system hang in usually less than 12 hours. With this patch applied, the cumulative uptime has now been 34 days without issues. To provoke system hang, light loads on both render and bsd engines in parallel have been used: glxgears >/dev/null 2>/dev/null & mpv --vo=vaapi --hwdec=vaapi --loop=inf vid.mp4 So far, author has not witnessed system hang with above load and this patch applied. Reports from the tenacious people at kernel bugzilla are also promising. Considering that the punit access frequency with this patch is considerably less, there is a possibility that this will push the, still unknown, root cause past the triggering point on most loads. But as we now can reliably reproduce the hang independently, we can reduce the pain that users are having and use a static thresholds until a root cause is found. v3: don't break debugfs and simplification (Chris Wilson) References: https://bugzilla.kernel.org/show_bug.cgi?id=109051 Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> Cc: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> Cc: Len Brown <len.brown@xxxxxxxxx> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx> Cc: Jani Nikula <jani.nikula@xxxxxxxxx> Cc: fritsch@xxxxxxxx Cc: miku@xxxxxx Cc: Ezequiel Garcia <ezequiel@xxxxxxxxxxxxxxxxxxxx> CC: Michal Feix <michal@xxxxxxx> Cc: Hans de Goede <hdegoede@xxxxxxxxxx> Cc: Deepak S <deepak.s@xxxxxxxxxxxxxxx> Cc: Jarkko Nikula <jarkko.nikula@xxxxxxxxxxxxxxx> Acked-by: Daniel Vetter <daniel.vetter@xxxxxxxx> Acked-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> Signed-off-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> Link: http://patchwork.freedesktop.org/patch/msgid/1487166779-26945-1-git-send-email-mika.kuoppala@xxxxxxxxx (cherry picked from commit 6067a27d1f0184596d51decbac1c1fdc4acb012f) Signed-off-by: Jani Nikula <jani.nikula@xxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/gpu/drm/i915/intel_pm.c | 7 +++++++ 1 file changed, 7 insertions(+) --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4876,6 +4876,12 @@ static void gen6_set_rps_thresholds(stru break; } + /* When byt can survive without system hang with dynamic + * sw freq adjustments, this restriction can be lifted. + */ + if (IS_VALLEYVIEW(dev_priv)) + goto skip_hw_write; + I915_WRITE(GEN6_RP_UP_EI, GT_INTERVAL_FROM_US(dev_priv, ei_up)); I915_WRITE(GEN6_RP_UP_THRESHOLD, @@ -4896,6 +4902,7 @@ static void gen6_set_rps_thresholds(stru GEN6_RP_UP_BUSY_AVG | GEN6_RP_DOWN_IDLE_AVG); +skip_hw_write: dev_priv->rps.power = new_power; dev_priv->rps.up_threshold = threshold_up; dev_priv->rps.down_threshold = threshold_down; Patches currently in stable-queue which might be from mika.kuoppala@xxxxxxxxxxxxxxx are queue-4.10/drm-i915-avoid-tweaking-evaluation-thresholds-on-baytrail-v3.patch