On 08/12/2022 1:55 pm, Marek Marczykowski-Górecki wrote:
> Hi,
>
> There is an issue with i915 on Xen PV (dom0). The end result is a lot of
> glitches, like here: https://openqa.qubes-os.org/tests/54748#step/startup/8
> (this one is on ADL, Linux 6.1-rc7 as a Xen PV dom0). It's using Xorg
> with "modesetting" driver.
>
> After some iterations of debugging, we narrowed it down to i915 handling
> caching. The main difference is that PAT is setup differently on Xen PV
> than on native Linux. Normally, Linux does have appropriate abstraction
> for that, but apparently something related to i915 doesn't play well
> with it. The specific difference is:
> native linux:
> x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT
> xen pv:
> x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WC  WP  UC  UC
>                                   ~~          ~~      ~~  ~~
>
> The specific impact depends on kernel version and the hardware. The most
> severe issues I see on >=ADL, but some older hardware is affected too -
> sometimes only if composition is disabled in the window manager.
> Some more information is collected at
> https://github.com/QubesOS/qubes-issues/issues/4782 (and few linked
> duplicates...).
>
> Kind-of related commit is here:
> https://github.com/torvalds/linux/commit/bdd8b6c98239cad ("drm/i915:
> replace X86_FEATURE_PAT with pat_enabled()") - it is the place where
> i915 explicitly checks for PAT support, so I'm cc-ing people mentioned
> there too.
>
> Any ideas?
>
> The issue can be easily reproduced without Xen too, by adjusting PAT in
> Linux:
> -----8<-----
> diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
> index 66a209f7eb86..319ab60c8d8c 100644
> --- a/arch/x86/mm/pat/memtype.c
> +++ b/arch/x86/mm/pat/memtype.c
> @@ -400,8 +400,8 @@ void pat_init(void)
>                  * The reserved slots are unused, but mapped to their
>                  * corresponding types in the presence of PAT errata.
>                  */
> -               pat = PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, UC) |
> -                     PAT(4, WB) | PAT(5, WP) | PAT(6, UC_MINUS) | PAT(7, WT);
> +               pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
> +                     PAT(4, WC) | PAT(5, WP) | PAT(6, UC) | PAT(7, UC);
>         }
>
>         if (!pat_bp_initialized) {
> -----8<-----
>

Hello, can anyone help please?

Intel's CI has taken this reproducer of the bug, and confirmed the
regression.
https://lore.kernel.org/intel-gfx/Y5Hst0bCxQDTN7lK@mail-itl/T/#m4480c15a0d117dce6210562eb542875e757647fb

We're reasonably confident that it is an i915 bug (given the repro with
no Xen in the mix), but we're out of any further ideas.

Thanks,

~Andrew
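
For anyone comparing the two PAT layouts quoted above, here is a minimal
userspace sketch of how the eight slots pack into one 64-bit value and which
slots differ between the native and Xen PV configurations. It is an
illustration only, not kernel code: the `PAT()` macro merely mirrors the shape
used in the quoted diff, the type encodings are the architectural x86 values,
and the function names (`pat_slot`, `pat_name`, `pat_native`, `pat_xen_pv`,
`pat_diff_mask`, `pat_print_comparison`) are hypothetical helpers made up for
this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural PAT memory-type encodings (low 3 bits of each slot). */
enum pat_type {
    PAT_UC       = 0,  /* uncacheable */
    PAT_WC       = 1,  /* write-combining */
    PAT_WT       = 4,  /* write-through */
    PAT_WP       = 5,  /* write-protected */
    PAT_WB       = 6,  /* write-back */
    PAT_UC_MINUS = 7,  /* UC-, overridable by MTRRs */
};

/* Put a memory type into a PAT slot; each slot occupies one byte. */
#define PAT(slot, type) ((uint64_t)PAT_##type << ((slot) * 8))

/* Extract the memory type stored in one slot (0..7). */
unsigned pat_slot(uint64_t pat, int slot)
{
    assert(slot >= 0 && slot < 8);
    return (unsigned)((pat >> (slot * 8)) & 0x7);
}

/* Short name for a memory type, in the style of the x86/PAT log line. */
const char *pat_name(unsigned type)
{
    switch (type) {
    case PAT_UC:       return "UC";
    case PAT_WC:       return "WC";
    case PAT_WT:       return "WT";
    case PAT_WP:       return "WP";
    case PAT_WB:       return "WB";
    case PAT_UC_MINUS: return "UC-";
    default:           return "??";  /* reserved encodings 2 and 3 */
    }
}

/* Layout from the "native linux" log line in the report. */
uint64_t pat_native(void)
{
    return PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, UC) |
           PAT(4, WB) | PAT(5, WP) | PAT(6, UC_MINUS) | PAT(7, WT);
}

/* Layout from the "xen pv" log line (same as the reproducer patch). */
uint64_t pat_xen_pv(void)
{
    return PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
           PAT(4, WC) | PAT(5, WP) | PAT(6, UC) | PAT(7, UC);
}

/* Bitmask with bit i set when slot i holds a different type in a and b. */
unsigned pat_diff_mask(uint64_t a, uint64_t b)
{
    unsigned mask = 0;

    for (int i = 0; i < 8; i++)
        if (pat_slot(a, i) != pat_slot(b, i))
            mask |= 1u << i;
    return mask;
}

/* Print both layouts slot by slot and flag the slots that differ. */
void pat_print_comparison(void)
{
    uint64_t native = pat_native(), xen = pat_xen_pv();
    unsigned diff = pat_diff_mask(native, xen);

    for (int i = 0; i < 8; i++)
        printf("slot %d: native %-3s  xen pv %-3s%s\n", i,
               pat_name(pat_slot(native, i)), pat_name(pat_slot(xen, i)),
               (diff & (1u << i)) ? "  <-- differs" : "");
}
```

Calling `pat_print_comparison()` from a small `main()` flags slots 1, 4, 6
and 7, the same four positions marked with `~~` in the quoted report. Code
that assumes a fixed slot number for a given cache type (rather than looking
the type up) would pick the wrong type in exactly those slots.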