On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote:
> Hi,

Hi Doug,

> On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <dianders@xxxxxxxxxxxx> wrote:
> > This is an attempt to resurrect Sumit's old patch series [1] that
> > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> > also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> > could find was v7, so I called this series v8. I haven't copied all of
> > his old changelogs here, but you can find them from the link.
> >
> > Since v7, I have:
> > * Addressed the small amount of feedback that was there for v7.
> > * Rebased.
> > * Added a new patch that prevents us from spamming the logs with idle
> >   tasks.
> > * Added an extra patch to gracefully fall back to regular IPIs if
> >   pseudo-NMIs aren't there.
> >
> > Since there appear to be a few different patch series related to
> > being able to use NMIs to get stack traces of crashed systems, let me
> > try to organize them to the best of my understanding:
> >
> > a) This series. On its own, a) will (among other things) enable stack
> >    traces of all running processes with the soft lockup detector if
> >    you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> >    its own, a) doesn't give a hard lockup detector.
> >
> > b) A different recently-posted series [2] that adds a hard lockup
> >    detector based on perf. On its own, b) gives a stack crawl of the
> >    locked up CPU but no stack crawls of other CPUs (even if they're
> >    locked too). Together with a) + b) we get everything (full lockup
> >    detect, full ability to get stack crawls).
> >
> > c) The old Android "buddy" hard lockup detector [3] that I'm
> >    considering trying to upstream. If b) lands then I believe c) would
> >    be redundant (at least for arm64). c) on its own is really only
> >    useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> >    (see [4]). a) + c) is roughly as good as a) + b).
>
> It's been 3 weeks and I haven't heard a peep on this series. That
> means nobody has any objections and it's all good to land, right?
> Right? :-P

FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI
support (and fixing that requires an overhaul of our DAIF / IRQ flag
management, which I've been chipping away at for a number of releases), so I
hadn't looked at this in detail yet because the foundations are still somewhat
dodgy.

I appreciate that this has been around for a while, and it's on my queue to
look at.

Thanks,
Mark.