On Fri, 2024-12-13 at 14:23 +0100, Thomas Gleixner wrote:
> On Fri, Dec 13 2024 at 19:48, Ming Lei wrote:
> > On Fri, Dec 13, 2024 at 12:31:24PM +0100, Thomas Gleixner wrote:
> > > I'd rather say, that's a kexec problem. On the same instance a loop test
> > > of suspend to ram with pm_test=core just works fine. That's equivalent
> > > to the kexec scenario. It goes down to syscore_suspend() and skips the
> > > actual suspend low level magic. It then resumes with syscore_resume()
> > > and brings the machine back up.
> > >
> > > That runs for 2 hours now, while the kexec muck dies within 2
> > > minutes....
> > >
> > > And if you look at the difference of these implementations, you might
> > > notice that kexec just implemented some rudimentary version of the
> > > actual suspend logic. Based on let's hope it works that way.
> > >
> > > This is just insane and should be rewritten to actually reuse the suspend
> > > mechanism, which is way better tested than this kexec jump muck.
> >
> > But kexec is supposed to align with reboot/shutdown, instead of suspend,
> > and it is calling ->shutdown() for notifying driver & device.
>
> That's only true for the case where the new kernel takes over.
>
> In the case KEXEC_JUMP=n and kexec_image->preserve_context == true, then
> it is supposed to align with suspend/resume and if you look at the code
> then it actually mimics suspend/resume in the most dilettanteish way.

Did you mean KEXEC_JUMP=y there?

I spent a while the other week trying to understand the case where
CONFIG_KEXEC_JUMP=n and kexec_image->preserve_context=true, and came to
the conclusion that it was a mirage. Userspace can't *actually* set the
KEXEC_PRESERVE_CONTEXT bit when setting up the image, if KEXEC_JUMP=n.
The whole of the code path for that case is dead code.
It's confusing because as discussed elsewhere, we don't just #ifdef out
the whole of that dead code path, but only the bits which don't
actually *compile* (like references to restore_processor_state() etc.).

> It's a patently bad idea to clobber the kernel with kexec jump "fixes"
> instead of using the well tested and established suspend/resume
> machinery.
>
> All it takes is to:
>
>    1) disable the wakeup logic
>
>    2) provide a mechanism to invoke machine_kexec() instead of the
>       actual suspend mechanism.
>
> No?

Agreed. The hacky proof of concept I posted earlier invoking
machine_kexec() instead of suspend_ops->enter() works fine. I'll look
at cleaning it up and making it not invoke all the ACPI hooks for
*actual* suspend to RAM, etc.

As I noted though, it doesn't address that linux-scsi report which was
a *real* kexec, not a kjump.
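
To illustrate why the KEXEC_JUMP=n, preserve_context=true case looks
unreachable: as far as I can tell, kexec_load() rejects any flag
outside a legal mask, and that mask only includes
KEXEC_PRESERVE_CONTEXT when CONFIG_KEXEC_JUMP is set. Below is a
minimal userspace sketch of that check, not the kernel code itself.
The flag values are the ones from the uapi header; legal_flags() and
flags_accepted() are hypothetical helpers, the config option is
modelled as a runtime bool, and the other legal flags are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in include/uapi/linux/kexec.h. */
#define KEXEC_ON_CRASH          0x00000001UL
#define KEXEC_PRESERVE_CONTEXT  0x00000002UL
#define KEXEC_ARCH_MASK         0xffff0000UL

/*
 * Hypothetical helper: the set of legal flags for a given config.
 * Simplified assumption: only the two flags above are modelled.
 */
static unsigned long legal_flags(bool kexec_jump)
{
	unsigned long mask = KEXEC_ON_CRASH;

	if (kexec_jump)
		mask |= KEXEC_PRESERVE_CONTEXT;
	return mask;
}

/*
 * Hypothetical helper modelling the flag validation: any bit outside
 * the architecture field must be in the legal mask, otherwise the
 * load is refused.
 */
static bool flags_accepted(unsigned long flags, bool kexec_jump)
{
	return (flags & legal_flags(kexec_jump)) == (flags & ~KEXEC_ARCH_MASK);
}

/* Checks the two cases discussed in this mail. */
static void self_check(void)
{
	/* KEXEC_JUMP=n: userspace cannot request a preserved context. */
	assert(!flags_accepted(KEXEC_PRESERVE_CONTEXT, false));
	/* KEXEC_JUMP=y: the same request passes the flag check. */
	assert(flags_accepted(KEXEC_PRESERVE_CONTEXT, true));
}
```

If that model is right, nothing can set preserve_context when
KEXEC_JUMP=n, so whatever in that path still compiles is never run.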