On Wed, Jan 25, 2017 at 03:09:04PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
>
> > On Wed, Jan 25, 2017 at 02:31:08PM +0200, Mika Kuoppala wrote:
> >> Certain Baytrails, namely the 4 cpu core variants, have been
> >> plaqued by spurious system hangs, mostly occurring with light loads.
> >>
> >> Multiple bisects by various people point to a commit which changes the
> >> reclocking strategy for Baytrail to follow its bigger brethen:
> >> commit 8fb55197e64d ("drm/i915: Agressive downclocking on Baytrail")
> >>
> >> There is also a review comment attached to this commit from Deepak S
> >> on avoiding punit access on Cherryview and thus it is excluded on
> >> common reclocking path. By taking the same approach and omitting
> >> the punit access by not tweaking the thresholds when the hardware
> >> has been asked to move into different frequency, considerable gains
> >> in stability have been observed.
> >>
> >> With J1900 box, light render/video load would end up in system hang
> >> in usually less than 12 hours. With this patch applied, the cumulative
> >> uptime has now been 34 days without issues. To provoke system hang,
> >> light loads on both render and bsd engines in parallel have been used:
> >> glxgears >/dev/null 2>/dev/null &
> >> mpv --vo=vaapi --hwdec=vaapi --loop=inf vid.mp4
> >>
> >> So far, author has not witnessed system hang with above load
> >> and this patch applied. Reports from the tenacious people at
> >> kernel bugzilla are also promising.
> >>
> >> Considering that the punit access frequency with this patch is
> >> considerably less, there is a possibility that this will push
> >> the, still unknown, root cause past the triggering point on most loads.
> >> Further work on investigating the punit accesses on byt is welcomed.
> >
> > Please find the underlying problem and not disabling rps for all vlv
> > for a GT specific problem.
>
> This is not disabling rps.

You are disabling the key ingredients of the algorithm, making it less
generic in order to work around a problem elsewhere. You are tackling the
symptoms and not the cause.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre