This is a note to let you know that I've just added the patch titled x86/sev: Fix kernel crash due to late update to read-only ghcb_version to the 6.1-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: x86-sev-fix-kernel-crash-due-to-late-update-to-read-.patch and it can be found in the queue-6.1 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit d7460400c05eb1c2bc23de739bd6d2f1c9001e56 Author: Ashwin Dayanand Kamat <ashwin.kamat@xxxxxxxxxxxx> Date: Wed Nov 29 16:10:29 2023 +0530 x86/sev: Fix kernel crash due to late update to read-only ghcb_version [ Upstream commit 27d25348d42161837be08fc63b04a2559d2e781c ] A write-access violation page fault kernel crash was observed while running cpuhotplug LTP testcases on SEV-ES enabled systems. The crash was observed during hotplug, after the CPU was offlined and the process was migrated to different CPU. setup_ghcb() is called again which tries to update ghcb_version in sev_es_negotiate_protocol(). Ideally this is a read_only variable which is initialised during booting. Trying to write it results in a pagefault: BUG: unable to handle page fault for address: ffffffffba556e70 #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation [ ...] Call Trace: <TASK> ? __die_body.cold+0x1a/0x1f ? __die+0x2a/0x35 ? page_fault_oops+0x10c/0x270 ? setup_ghcb+0x71/0x100 ? __x86_return_thunk+0x5/0x6 ? search_exception_tables+0x60/0x70 ? __x86_return_thunk+0x5/0x6 ? fixup_exception+0x27/0x320 ? kernelmode_fixup_or_oops+0xa2/0x120 ? __bad_area_nosemaphore+0x16a/0x1b0 ? kernel_exc_vmm_communication+0x60/0xb0 ? bad_area_nosemaphore+0x16/0x20 ? do_kern_addr_fault+0x7a/0x90 ? exc_page_fault+0xbd/0x160 ? asm_exc_page_fault+0x27/0x30 ? setup_ghcb+0x71/0x100 ? setup_ghcb+0xe/0x100 cpu_init_exception_handling+0x1b9/0x1f0 The fix is to call sev_es_negotiate_protocol() only in the BSP boot phase, and it only needs to be done once in any case. [ mingo: Refined the changelog. ] Fixes: 95d33bfaa3e1 ("x86/sev: Register GHCB memory when SEV-SNP is active") Suggested-by: Tom Lendacky <thomas.lendacky@xxxxxxx> Co-developed-by: Bo Gan <bo.gan@xxxxxxxxxxxx> Signed-off-by: Bo Gan <bo.gan@xxxxxxxxxxxx> Signed-off-by: Ashwin Dayanand Kamat <ashwin.kamat@xxxxxxxxxxxx> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx> Acked-by: Tom Lendacky <thomas.lendacky@xxxxxxx> Link: https://lore.kernel.org/r/1701254429-18250-1-git-send-email-kashwindayan@xxxxxxxxxx Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index 68b2a9d3dbc6b..c8dfb0fdde7f9 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -1279,10 +1279,6 @@ void setup_ghcb(void) if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) return; - /* First make sure the hypervisor talks a supported protocol. */ - if (!sev_es_negotiate_protocol()) - sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ); - /* * Check whether the runtime #VC exception handler is active. It uses * the per-CPU GHCB page which is set up by sev_es_init_vc_handling(). @@ -1297,6 +1293,13 @@ void setup_ghcb(void) return; } + /* + * Make sure the hypervisor talks a supported protocol. + * This gets called only in the BSP boot phase. + */ + if (!sev_es_negotiate_protocol()) + sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ); + /* * Clear the boot_ghcb. The first exception comes in before the bss * section is cleared.