On 7/14/2022 10:53 PM, Chuck Zmudzinski wrote: > On 7/14/2022 10:19 PM, Chuck Zmudzinski wrote: > > On 7/14/2022 1:40 AM, Juergen Gross wrote: > > > On 13.07.22 03:36, Chuck Zmudzinski wrote: > > > > The commit 99c13b8c8896d7bcb92753bf > > > > ("x86/mm/pat: Don't report PAT on CPUs that don't support it") > > > > incorrectly failed to account for the case in init_cache_modes() when > > > > CPUs do support PAT and falsely reported PAT to be disabled when in > > > > fact PAT is enabled. In some environments, notably in Xen PV domains, > > > > MTRR is disabled but PAT is still enabled, and that is the case > > > > that the aforementioned commit failed to account for. > > > > > > > > As an unfortunate consequnce, the pat_enabled() function currently does > > > > not correctly report that PAT is enabled in such environments. The fix > > > > is implemented in init_cache_modes() by setting pat_bp_enabled to true > > > > in init_cache_modes() for the case that commit 99c13b8c8896d7bcb92753bf > > > > ("x86/mm/pat: Don't report PAT on CPUs that don't support it") failed > > > > to account for. > > > > > > > > This approach arranges for pat_enabled() to return true in the Xen PV > > > > environment without undermining the rest of PAT MSR management logic > > > > that considers PAT to be disabled: Specifically, no writes to the PAT > > > > MSR should occur. > > > > > > > > This patch fixes a regression that some users are experiencing with > > > > Linux as a Xen Dom0 driving particular Intel graphics devices by > > > > correctly reporting to the Intel i915 driver that PAT is enabled where > > > > previously it was falsely reporting that PAT is disabled. Some users > > > > are experiencing system hangs in Xen PV Dom0 and all users on Xen PV > > > > Dom0 are experiencing reduced graphics performance because the keying of > > > > the use of WC mappings to pat_enabled() (see arch_can_pci_mmap_wc()) > > > > means that in particular graphics frame buffer accesses are quite a bit > > > > less performant than possible without this patch. > > > > > > > > Also, with the current code, in the Xen PV environment, PAT will not be > > > > disabled if the administrator sets the "nopat" boot option. Introduce > > > > a new boolean variable, pat_force_disable, to forcibly disable PAT > > > > when the administrator sets the "nopat" option to override the default > > > > behavior of using the PAT configuration that Xen has provided. > > > > > > > > For the new boolean to live in .init.data, init_cache_modes() also needs > > > > moving to .init.text (where it could/should have lived already before). > > > > > > > > Fixes: 99c13b8c8896d7bcb92753bf ("x86/mm/pat: Don't report PAT on CPUs that don't support it") > > > > Co-developed-by: Jan Beulich <jbeulich@xxxxxxxx> > > > > Cc: stable@xxxxxxxxxxxxxxx > > > > Signed-off-by: Chuck Zmudzinski <brchuckz@xxxxxxx> > > > > --- > > > > v2: *Add force_pat_disabled variable to fix "nopat" on Xen PV (Jan Beulich) > > > > *Add the necessary code to incorporate the "nopat" fix > > > > *void init_cache_modes(void) -> void __init init_cache_modes(void) > > > > *Add Jan Beulich as Co-developer (Jan has not signed off yet) > > > > *Expand the commit message to include relevant parts of the commit > > > > message of Jan Beulich's proposed patch for this problem > > > > *Fix 'else if ... {' placement and indentation > > > > *Remove indication the backport to stable branches is only back to 5.17.y > > > > > > > > I think these changes address all the comments on the original patch > > > > > > > > I added Jan Beulich as a Co-developer because Juergen Gross asked me to > > > > include Jan's idea for fixing "nopat" that was missing from the first > > > > version of the patch. > > > > > > > > The patch has been tested, it works as expected with and without nopat > > > > in the Xen PV Dom0 environment. That is, "nopat" causes the system to > > > > exhibit the effects and problems that lack of PAT support causes. > > > > > > > > arch/x86/mm/pat/memtype.c | 16 ++++++++++++++-- > > > > 1 file changed, 14 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c > > > > index d5ef64ddd35e..10a37d309d23 100644 > > > > --- a/arch/x86/mm/pat/memtype.c > > > > +++ b/arch/x86/mm/pat/memtype.c > > > > @@ -62,6 +62,7 @@ > > > > > > > > static bool __read_mostly pat_bp_initialized; > > > > static bool __read_mostly pat_disabled = !IS_ENABLED(CONFIG_X86_PAT); > > > > +static bool __initdata pat_force_disabled = !IS_ENABLED(CONFIG_X86_PAT); > > > > static bool __read_mostly pat_bp_enabled; > > > > static bool __read_mostly pat_cm_initialized; > > > > > > > > @@ -86,6 +87,7 @@ void pat_disable(const char *msg_reason) > > > > static int __init nopat(char *str) > > > > { > > > > pat_disable("PAT support disabled via boot option."); > > > > + pat_force_disabled = true; > > > > return 0; > > > > } > > > > early_param("nopat", nopat); > > > > @@ -272,7 +274,7 @@ static void pat_ap_init(u64 pat) > > > > wrmsrl(MSR_IA32_CR_PAT, pat); > > > > } > > > > > > > > -void init_cache_modes(void) > > > > +void __init init_cache_modes(void) > > > > { > > > > u64 pat = 0; > > > > > > > > @@ -292,7 +294,7 @@ void init_cache_modes(void) > > > > rdmsrl(MSR_IA32_CR_PAT, pat); > > > > } > > > > > > > > - if (!pat) { > > > > + if (!pat || pat_force_disabled) { > > > > > > Can we just remove this modification and ... > > > > > > > /* > > > > * No PAT. Emulate the PAT table that corresponds to the two > > > > * cache bits, PWT (Write Through) and PCD (Cache Disable). > > > > @@ -313,6 +315,16 @@ void init_cache_modes(void) > > > > */ > > > > pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) | > > > > PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC); > > > > + } else if (!pat_bp_enabled) { > > > > > > ... use > > > > > > + } else if (!pat_bp_enabled && !pat_force_disabled) { > > > > > > here? > > > > > > This will result in the desired outcome in all cases IMO: If PAT wasn't > > > disabled via "nopat" and the PAT MSR has a non-zero value (from BIOS or > > > Hypervisor) and PAT has been disabled implicitly (e.g. due to lack of > > > MTRR), then PAT will be set to "enabled" again. > > > > With that, you can also completely remove the new Boolean - it > > will be a meaningless variable wasting memory. This will also make > > my patch more or less do what Jan's patch does - the "nopat" option > > will not cause the situation when the PAT MSR does not match the > > software view. So you are basically proposing just going back to > > my original patch, after fixing the style problems, of course. That > > also would solve the problem of needing Jan's S-o-b. I am inclined, > > however, to wait for a maintainer who has power to actually do the > > commit, to make a comment. Your R-b to my v2 did not have much clout > > with the actual maintainers, as far as I can tell. I am somewhat annoyed > > that it was at your suggestion that my v2 ended up confusing the > > main issue, the regression, with the red herring of the "nopat" > > option. > > > > Chuck > > Actually, what your change does depend on keeping > pat_force_disable, but after all the discussion and > further thinking about this, I would prefer that you > give a R-b to v3 as simply my original patch with the > style fixed. I think it is wrong to confuse the regression > with the "nopat" issue. If you and Jan want to do a patch > for the "nopat" issue, that is your decision. I am not interested > in that. I am interested in fixing the regression. Also, I am > not included to formally submit v3 until Dave, Andy, Boris, or > someone else with more clout here on Linux expresses > interest in giving this idea an R-b. I meant to say "i am not inclined..." not I am not included..." Sorry for the confusion, Chuck > > > > > > > > > + /* > > > > + * In some environments, specifically Xen PV, PAT > > > > + * initialization is skipped because MTRRs are disabled even > > > > + * though PAT is available. In such environments, set PAT to > > > > + * enabled to correctly indicate to callers of pat_enabled() > > > > + * that CPU support for PAT is available. > > > > + */ > > > > + pat_bp_enabled = true; > > > > + pr_info("x86/PAT: PAT enabled by init_cache_modes\n"); > > > > } > > > > > > > > __init_cache_modes(pat); > > > > > > > > > Juergen > > > > >