On Wed, 8 Sep 2021, Michael Schmitz wrote:
In a related case, I've managed to swap my 'resume_userspace' format error for a nice 'illegal instruction' format error apparently caused by an invalid function pointer in __handle_irq_event_percpu(), just by disabling all interrupts upon entering the auto_inthandler and user_inthandler exception handlers.

This bug is quite readily reproduced by running your kernel_coverage.sh script in a loop (panics on the first stress test on the second pass):

Stress run 2
Logging to stress-ng-20210908-0838.log
./kernel-coverage.sh: line 272: lcov: command not found
running --fork 1 --fork-vm -t 60 --timestamp --no-rand-seed --times
stress-ng: 08:40:08.70 info: [1914] setting to a 60 second run per stressor
stress-ng: 08:40:08.82 info: [1914] dispatching hogs: 1 fork
packet_write_wait: Connection to 10.1.1.4 port 22: Broken pipe

Why disabling interrupts during interrupt processing would make matters worse doesn't make any sense to me...

Are you able to reproduce that with a stock mainline build?