On Tue, 2023-12-05 at 03:45 +0300, Kirill A. Shutemov wrote: > +static void tdx_kexec_unshare_mem(bool crash) > +{ > + unsigned long addr, end; > + long found = 0, shared; > + > + /* Stop new private<->shared conversions */ > + conversion_allowed = false; I wonder if this might need a compiler barrier here to be totally safe. I'm not sure. > + > + /* > + * Crash kernel reaches here with interrupts disabled: can't > wait for > + * conversions to finish. > + * > + * If race happened, just report and proceed. > + */ > + if (!crash) { > + unsigned long timeout; > + > + /* > + * Wait for in-flight conversions to complete. > + * > + * Do not wait more than 30 seconds. > + */ > + timeout = 30 * USEC_PER_SEC; > + while (atomic_read(&conversions_in_progress) && > timeout--) > + udelay(1); > + } > + > + if (atomic_read(&conversions_in_progress)) > + pr_warn("Failed to finish shared<->private > conversions\n"); I can't think of any non-ridiculous way to handle this case. Maybe we need VMM help. > > diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c > index 830425e6d38e..c81afffaa954 100644 > --- a/arch/x86/kernel/reboot.c > +++ b/arch/x86/kernel/reboot.c > @@ -12,6 +12,7 @@ > #include <linux/delay.h> > #include <linux/objtool.h> > #include <linux/pgtable.h> > +#include <linux/kexec.h> > #include <acpi/reboot.h> > #include <asm/io.h> > #include <asm/apic.h> > @@ -31,6 +32,7 @@ > #include <asm/realmode.h> > #include <asm/x86_init.h> > #include <asm/efi.h> > +#include <asm/tdx.h> > > /* > * Power off function, if any > @@ -716,6 +718,14 @@ static void > native_machine_emergency_restart(void) > > void native_machine_shutdown(void) > { > + /* > + * Call enc_kexec_unshare_mem() while all CPUs are still > active and > + * interrupts are enabled. This will allow all in-flight > memory > + * conversions to finish cleanly before unsharing all memory. > + */ > + if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) && > kexec_in_progress) > + x86_platform.guest.enc_kexec_unshare_mem(false); These questions are coming from an incomplete understanding of the kexec/reboot operation. Please disregard if it is not helpful. By doing this while other tasks can still run, it handles the conversion races in the !crash case. But then it sets shared pages to NP. What happens if another active task tries to write to one? I guess we rely on the kernel_restart_prepare()->device_shutdown() to clean up, which runs before native_machine_shutdown(). So there might be conversions in progress when tdx_kexec_unshare_mem() is called, from the allocator work queues. But the actual memory won't be accessed during that operation. But the console must be active? Or otherwise who can see these warnings. It doesn't use a shared page? Or the KVM clock, which looks to clean up at cpu tear down, which now happens after tdx_kexec_unshare_mem()? So I wonder if there might be cases. If so, maybe you could halt the conversions in native_machine_shutdown(), then do the actual reset to private after tasks can't schedule. I'd still wonder about if anything might try to access a shared page triggered by the console output. > + > /* Stop the cpus and apics */ > #ifdef CONFIG_X86_IO_APIC > /* _______________________________________________ kexec mailing list kexec@xxxxxxxxxxxxxxxxxxx http://lists.infradead.org/mailman/listinfo/kexec