On Wed, Aug 31, 2022 at 4:16 PM Chang S. Bae <chang.seok.bae@xxxxxxxxx> wrote:
>
> On 8/25/2022 8:31 AM, Evan Green wrote:
> >
> > Here's the log I've got that pointed me down this path:
> > https://pastebin.com/VvR1EHvE
>
> <3>[43486.261583] x86/keylocker: The key backup access failed with read error.
> <3>[43486.261584] x86/keylocker: Failed to restore internal wrapping key.
>
> Looks like the IWKey backup was corrupted on that system.
>
> > Relevant bit pasted below:
> >
> > <6>[43486.263035] Enabling non-boot CPUs ...
> > <6>[43486.263081] x86: Booting SMP configuration:
> > <6>[43486.263082] smpboot: Booting Node 0 Processor 1 APIC 0x1
> > <2>[43486.264010] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
> > <1>[43486.264019] BUG: unable to handle page fault for address: ffffffff94b483a6
> > <1>[43486.264021] #PF: supervisor instruction fetch in kernel mode
> > <1>[43486.264023] #PF: error_code(0x0011) - permissions violation
> > <6>[43486.264025] PGD 391c0e067 P4D 391c0e067 PUD 391c0f063 PMD 10006c063 PTE 8000000392148163
> > <4>[43486.264031] Oops: 0011 [#1] PREEMPT SMP NOPTI
> > <4>[43486.264035] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G U 5.10.136-19391-gadfe4d4b8c04 #1 b640352a7a0e5f1522aed724296ad63f90c007df
> > <4>[43486.264036] Hardware name: Google Primus/Primus, BIOS Google_Primus.14505.145.0 06/23/2022
> > <4>[43486.264042] RIP: 0010:load_keylocker+0x0/0x7f
>
> But, I don't get the reason why it hit this. On the wake-up path,
> copy_keylocker() is supposed to be called.

Interesting, that's helpful. I thought I had a lead based on this,
which was that in this case we were doing a hibernate to shutdown,
rather than hibernate to S4. The IWKey backup is only valid down to
S4, so a read error on resume from this type of hibernate might make
sense. I know keylocker won't successfully maintain handles across a
hibernate to shutdown and subsequent resume, but it shouldn't crash
either.
But this still doesn't explain this crash, since in this case we're
still on our way down and haven't even done the shutdown yet. We can
see the "PM: hibernation: Image created (1536412 pages copied)" log
line just before the keylocker read failure. So then it seems
something's not working with the pre-hibernate CPU hotplug path?

>
> I added some printout in there, and it looks to be fine with me:
>
> [ 218.488711] Enabling non-boot CPUs ...
> [ 218.488794] x86: Booting SMP configuration:
> [ 218.488795] smpboot: Booting Node 0 Processor 1 APIC 0x1
> [ 218.490634] x86/keylocker: restore processor (id=1)
> [ 218.491186] CPU1 is up
> ...

How were you exercising the CPU onlining in this case? Boot, cpu
hotplug, or hibernate?

-Evan