On Sat, 16 Jul 2022 01:27:31 +0000 Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:

> > ...
> >
> > > > production with real workloads and it has caused hard lockups.
> > > > Particularly network heavy workloads with a lot of threads in
> > > > epoll_wait() can easily trigger this issue if they get killed
> > > > (oom-killed in our case).
> > >
> > > Hard lockups are undesirable. Is a cc:stable justified here?
> >
> > Not for now as I don't know if we can blame a patch which might be the
> > source of this behavior.
>
> I am able to repro the epoll hard lockup on next-20220715 with Ben's
> patch reverted. The repro is a simple TCP server and tens of clients
> communicating over loopback. Though to cause the hard lockup I have to
> create a couple thousand threads in epoll_wait() in server and also
> reduce the kernel.watchdog_thresh. With Ben's patch the repro does not
> cause the hard lockup even with kernel.watchdog_thresh=1.
>
> Please add:
>
> Tested-by: Shakeel Butt <shakeelb@xxxxxxxxxx>

OK, thanks. I added the cc:stable. No Fixes:, as it has presumably been
there for a long time, perhaps for all time.