Hi,

On 11/1/20 6:41 PM, rwright@xxxxxxx wrote:
> From: Randy Wright <rwright@xxxxxxx>
>
> For several months, I've been experiencing GPU hangs when starting
> Cinnamon on an HP Pavilion Mini 300-020 if I try to run an upstream
> kernel. I reported this recently in
> https://gitlab.freedesktop.org/drm/intel/-/issues/2413 where I have
> attached the requested evidence including the state collected from
> /sys/class/drm/card0/error and debug output from dmesg.
>
> I ran a bisect to find the problem, which indicates this is the
> troublesome commit:
>
> [47f8253d2b8947d79fd3196bf96c1959c0f25f20] drm/i915/gen7: Clear all EU/L3 residual contexts
>
> The nature of that commit suggested to me that reducing the
> batch size used in the context clear operation might help this
> relatively low-powered system to avoid the hang.... and it did!
> I simply forced this system to take the smaller batch length that is
> already used for non-Haswell systems.
>
> The first two versions of this patch were posted as RFC
> patches to the Intel-gfx list, implementing the same
> algorithmic change in function batch_get_defaults,
> but without employing a properly constructed quirk.
>
> I've now cleaned up the patch to employ a new QUIRK_RENDERCLEAR_REDUCED.
> The quirk is presently set only for the aforementioned HP Pavilion Mini
> 300-020. The patch now touches three files to define the quirk, set it,
> and then check for it in function batch_get_defaults.

Note I'm not really an i915 dev.

With that said, I do wonder if we should not use the reduced batch size
in a lot more cases. The machine in question uses a 3558U CPU; if the
iGPU of that CPU has this issue, then I would expect pretty much all
Haswell U models (at a minimum) to have this issue.

So solving this with a quirk for just the HP Pavilion Mini 300-020
seems wrong to me. I think we need a more generic way of enabling the
reduced batch size. I even wonder if we should not simply use it
everywhere.

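To make the idea a bit more concrete, a rough sketch of what such a generic check might look like is below. This is not actual i915 code: the struct, the enum, the is_ult flag, the quirk bit value, and the batch dimensions are all hypothetical stand-ins, and only the name QUIRK_RENDERCLEAR_REDUCED comes from the patch. It only illustrates keying the reduced size off the platform class first and keeping the quirk as a fallback.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are real i915 definitions. */
enum platform { PLATFORM_IVYBRIDGE, PLATFORM_HASWELL };

struct device_info {
	enum platform platform;
	bool is_ult;           /* assumed flag for low-power "U" parts */
	unsigned long quirks;  /* bitmask of board-specific quirks */
};

/* Name taken from the patch; the bit value here is an assumption. */
#define QUIRK_RENDERCLEAR_REDUCED (1UL << 0)

struct batch_size {
	unsigned int surface_height;
	unsigned int surface_width;
};

/* Placeholder dimensions, chosen only so the two cases differ. */
static const struct batch_size full_batch    = { 256, 1024 };
static const struct batch_size reduced_batch = { 128, 512 };

/* Decide whether the smaller batch should be used for this device. */
static bool wants_reduced_batch(const struct device_info *dev)
{
	/* Non-Haswell systems already take the smaller batch. */
	if (dev->platform != PLATFORM_HASWELL)
		return true;

	/* Generic rule: every Haswell U part, not just one board. */
	if (dev->is_ult)
		return true;

	/* Fallback: a per-board quirk can still force the reduced size. */
	return (dev->quirks & QUIRK_RENDERCLEAR_REDUCED) != 0;
}

static struct batch_size pick_batch_size(const struct device_info *dev)
{
	return wants_reduced_batch(dev) ? reduced_batch : full_batch;
}
```

Using the reduced size everywhere would amount to wants_reduced_batch() always returning true, at which point neither the quirk nor the platform check is needed.
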
Since you do have a proper Haswell CPU, I guess it being a U model
makes the hang easier to trigger, but I suspect the higher TDP ones
may also still be susceptible ...

Regards,

Hans