From: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx> Sent: Monday, March 4, 2024 1:43 PM
>
> On 04/03/2024 18:12, John Ogness wrote:
> > [...]
> >> The second question is how to simulate a panic context in a
> >> non-destructive way, so we can test the panic notifiers in CI, without
> >> crashing the machine.
> >
> > I'm wondering if a "fake panic" can be implemented that quiesces all the
> > other CPUs via NMI (similar to kdb) and then calls the panic
> > notifiers. And finally releases everything back to normal. That might
> > produce a fairly realistic panic situation and should be fairly
> > non-destructive (depending on what the notifiers do and how long they
> > take).
> >
>
> Hi Jocelyn / John,
>
> one concern here is that the panic notifiers are kind of a no man's
> land, so we can have very simple / safe ones, while others are
> destructive in nature.
>
> An example of a good behaving notifier that is destructive is the
> Hyper-V one, that destroys an essential host-guest interface (called
> "vmbus connection"). What happens if we trigger this one just for
> testing purposes in a debugfs interface? Likely the guest would die...
>
> [+CCing Michael Kelley here since he seems interested in panic and is
> also expert in Hyper-V, just in case my example is bogus.]

The Hyper-V example is valid. After hv_panic_vmbus_unload() is called,
the VM won't be able to do any disk, network, or graphics frame buffer
I/O. There's no recovery short of restarting the VM.

Michael

[I have retired from Microsoft. I'm still occasionally contributing to
Linux kernel work with email mhklinux@xxxxxxxxxxx.]

>
> So, maybe the problem could be split in 2: the non-notifiers portion of
> the panic path, and the notifiers; maybe restricting the notifiers
> you'd run is a way to circumvent the risks, like if you could pass a
> list of the specific notifiers you aim to test, this could be
> interesting. Let's see what the others think and thanks for your work in
> the DRM panic notifier =)
>
> Cheers,
>
>
> Guilherme