On Thu, Nov 10, 2022, Dave Hansen wrote: > On 11/10/22 16:03, Kyle Huey wrote: > > On Tue, Nov 8, 2022 at 10:23 AM Dave Hansen <dave.hansen@xxxxxxxxx> wrote: > BTW, I'd love to know if KVM *REALLY* depends on this. Unlikely, but nearly impossible to know for sure. Copy+pasting my response[1] to an earlier version. : Hrm, the current behavior has been KVM ABI for a very long time. : : It's definitely odd because all other components will be initialized due to their : bits being cleared in the header during kvm_load_guest_fpu(), and it probably : wouldn't cause problems in practice as most VMMs likely do "all or nothing" loads. : But, in theory, userspace could save/restore a subset of guest XSTATE and rely on : the kernel not overwriting guest PKRU when its bit is cleared in the header. : : All that said, I don't see any reason to force KVM to change at this time, it's : trivial enough to handle KVM's oddities while providing sane behavior for others. : Nullify the pointer in the guest path and then update copy_uabi_to_xstate() to : play nice with a NULL pointer, e.g. : : /* : * Nullify @vpkru to preserve its current value if PKRU's bit isn't set : * in the header. KVM's odd ABI is to leave PKRU untouched in this : * case (all other components are eventually re-initialized). : */ : if (!(kstate->regs.xsave.header.xfeatures & XFEATURE_MASK_PKRU)) : vpkru = NULL; : : return copy_uabi_from_kernel_to_xstate(kstate, ustate, vpkru); > It'd be nice to kill if not. I don't disagree, my hesitation is purely that doing so might subtly break userspace. That said, I'm 99.9% certain no traditional VMM, e.g. QEMU, is relying on this behavior, as doing KVM_SET_XSAVE with anything except the guest's xfeatures mask would corrupt guest XSAVE state for everything except PKRU. I.e. for all intents and purposes, a traditional VMM must do KVM_GET_SAVE => KVM_SET_XSAVE without touching the xfeatures mask. And for non-traditional usage of KVM, I would be quite surprised if any of those use cases utilize PKRU in the guest, let alone play games with KVM_{G,S}SET_XSAVE. So, I'm not completely opposed to "fixing" KVM's ABI, but it should be done as a separate patch that is tagged "KVM: x86:" and clearly states that it's changing KVM's ABI in a way that could theoretically break userspace. > >> Would something like this be more clear? > >> > >> if (hdr.xfeatures & XFEATURE_MASK_PKRU) { > >> struct pkru_state *xpkru; > >> > >> xpkru = __raw_xsave_addr(xsave, XFEATURE_PKRU); > >> *pkru = xpkru->pkru; > >> } else { > >> /* > >> * KVM may pass a NULL 'pkru' to indicate > >> * that it does not need PKRU updated. > >> */ > >> if (pkru) > >> *pkru = 0; > >> } > > > > Yeah, Sean Christopherson suggested this (with the else and if > > collapsed into a single level) when I submitted this previously. > > I generally agree with Sean, but he's also been guilty of an atrocity or > two over the years. :) Heh, just one or two? I'll call that a win. > While I generally like low levels of indentation I also think my version is > much more clear in this case. I've no objection to a standalone if. My suggestion[2] was in response to code that zeroed @pkru before the XFEATURE_MASK_PKRU check. if (pkru) *pkru = 0; if (hdr.xfeatures & XFEATURE_MASK_PKRU) { struct pkru_state *xpkru; xpkru = __raw_xsave_addr(xsave, XFEATURE_PKRU); *pkru = xpkru->pkru; } [1] https://lore.kernel.org/all/Yv6szXuKGv75wWmm@xxxxxxxxxx [2] https://lore.kernel.org/all/YxDP6jie4cwzZIHp@xxxxxxxxxx