On Sat, Feb 4, 2023 at 10:16 PM Finn Thain <fthain@xxxxxxxxxxxxxx> wrote:
> That could be a bug I was chasing back in 2021 but never found. The mmap
> stressors in stress-ng were triggering a crash on a Mac Quadra, though
> only rarely. Sometimes it would run all day without a failure.
>
> Last year when I started using GCC 12 to build the kernel, I saw the same
> workload fail again but the failure mode had become a silent hang/livelock
> instead of the oopses I got with GCC 6.
>
> When I press the NMI button after the livelock I always see
> do_page_fault() in the backtrace. So I've been testing your patch. I've
> been running the same stress-ng reproducer for about 12 hours now with no
> failures, which looks promising.
>
> In case that stress-ng testing is of use:
> Tested-by: Finn Thain <fthain@xxxxxxxxxxxxxx>

Could you test the thing that Mark Rutland pointed to? He had an actual
test-case for this for the arm64 fixes some years ago.

See

  https://lore.kernel.org/all/Y9pD+TMP+%2FSyfeJm@FVFF77S0Q05N/

for his email with links to his old test-case.

Linus