On Mon, Jul 17, 2023 at 10:57 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Mon, 17 Jul 2023 at 15:53, Tao Liu <ltao@xxxxxxxxxx> wrote:
> >
> > Hi Borislav,
> >
> > On Thu, Jul 13, 2023 at 6:05 PM Borislav Petkov <bp@xxxxxxxxx> wrote:
> > >
> > > On Thu, Jun 01, 2023 at 03:20:44PM +0800, Tao Liu wrote:
> > > >  arch/x86/kernel/machine_kexec_64.c | 35 ++++++++++++++++++++++++++----
> > > >  1 file changed, 31 insertions(+), 4 deletions(-)
> > >
> > > Ok, pls try this totally untested thing.
> > >
> > > Thx.
> > >
> > > ---
> > > diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> > > index 09dc8c187b3c..fefe27b2af85 100644
> > > --- a/arch/x86/boot/compressed/sev.c
> > > +++ b/arch/x86/boot/compressed/sev.c
> > > @@ -404,13 +404,20 @@ void sev_enable(struct boot_params *bp)
> > >         if (bp)
> > >                 bp->cc_blob_address = 0;
> > >
> > > +       /* Check for the SME/SEV support leaf */
> > > +       eax = 0x80000000;
> > > +       ecx = 0;
> > > +       native_cpuid(&eax, &ebx, &ecx, &edx);
> > > +       if (eax < 0x8000001f)
> > > +               return;
> > > +
> > >         /*
> > >          * Setup/preliminary detection of SNP. This will be sanity-checked
> > >          * against CPUID/MSR values later.
> > >          */
> > >         snp = snp_init(bp);
> > >
> > > -       /* Check for the SME/SEV support leaf */
> > > +       /* Recheck the SME/SEV support leaf */
> > >         eax = 0x80000000;
> > >         ecx = 0;
> > >         native_cpuid(&eax, &ebx, &ecx, &edx);
> > >
> > Thanks a lot for the patch above! Sorry for the late response. I have
> > compiled and tested it locally against 6.5.0-rc1, though it can pass
> > the early stage of kexec kernel bootup,
>
> OK, so that proves that the cc_blob table access is the culprit here.
> That still means that kexec on SEV is likely to explode in the exact
> same way should anyone attempt that.
>
>
> > however the kernel will panic
> > occasionally later. The test machine is the one with Intel Atom
> > x6425RE cpu which encountered the page fault issue of missing efi
> > config table.
> >
>
> Agree with Boris that this seems entirely unrelated.

Agree, I will have a retest based on Boris's suggestions.

>
> > ...snip...
> > [   21.360763] nvme0n1: p1 p2 p3
> > [   21.364207] igc 0000:03:00.0: PTM enabled, 4ns granularity
> > [   21.421097] pps pps1: new PPS source ptp1
> > [   21.425396] igc 0000:03:00.0 (unnamed net_device) (uninitialized): PHC added
> > [   21.457005] igc 0000:03:00.0: 4.000 Gb/s available PCIe bandwidth
> > (5.0 GT/s PCIe x1 link)
> > [   21.465210] igc 0000:03:00.0 eth1: MAC: ...snip...
> > [   21.473424] igc 0000:03:00.0 enp3s0: renamed from eth1
> > [   21.479446] BUG: kernel NULL pointer dereference, address: 0000000000000008
> > [   21.486405] #PF: supervisor read access in kernel mode
> > [   21.491519] mmc1: Failed to initialize a non-removable card
> > [   21.491538] #PF: error_code(0x0000) - not-present page
> > [   21.502229] PGD 0 P4D 0
> > [   21.504773] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [   21.509133] CPU: 3 PID: 402 Comm: systemd-udevd Not tainted 6.5.0-rc1+ #1
> > [   21.515905] Hardware name: ...snip...
> >
> Why are you snipping the hardware name?

Sorry for the inconvenience here... The machine is borrowed from our
partner, which may not be officially released to the market. I haven't
discussed the legal issue with them. In addition, I think the stack
trace is more useful, so I snipped the hardware name. Sorry about
that...

>
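
For reference, the guard that the quoted patch adds ahead of snp_init()
can be modelled in userspace. This is only a rough sketch, not the
kernel code: the helper names sev_leaf_present() and
query_max_ext_leaf() are made up for illustration, and it assumes a
GCC or Clang toolchain that ships <cpuid.h> with __get_cpuid_max().
It shows the comparison from the patch, i.e. the maximum extended
CPUID leaf (EAX of leaf 0x80000000) must reach 0x8000001f before
anything SEV-related is worth probing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Extended CPUID leaf checked by the quoted patch. */
#define SEV_CPUID_LEAF 0x8000001fu

/*
 * Hypothetical helper: true when the reported maximum extended leaf
 * is high enough to include the SME/SEV leaf. Mirrors the
 * "if (eax < 0x8000001f) return;" test in the patch.
 */
static inline bool sev_leaf_present(uint32_t max_ext_leaf)
{
	return max_ext_leaf >= SEV_CPUID_LEAF;
}

/*
 * Hypothetical helper: ask the CPU for its highest extended leaf.
 * Returns 0 on non-x86 builds or when CPUID is unavailable.
 */
static inline uint32_t query_max_ext_leaf(void)
{
#if defined(__x86_64__) || defined(__i386__)
	return __get_cpuid_max(0x80000000u, NULL);
#else
	return 0;
#endif
}
```

Calling sev_leaf_present(query_max_ext_leaf()) on a CPU whose highest
extended leaf is below 0x8000001f returns false, which corresponds to
the early return the patch takes before snp_init() gets to touch the
cc_blob.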