On Thu, Jul 30, 2020 at 4:00 PM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>
> The key is the function make_task_wakekill which could probably
> benefit from a little more review and refinement but appears to
> be basically correct.

You really need to explain a lot more why you think this is all a
good idea.

For example, what if one of those other threads is waiting in line
for a critical lock, and the wait-queue you basically disabled was
the exclusive wait after lock handoff?

That means that the lock will now effectively be held by that thread.
No, it wasn't woken up, but it had the lock handed to it, and it's
now entirely unresponsive until it is killed.

How is that different from the deadlocks you're actually trying to fix?

These are the kinds of problems that the freezer() code had too, with
freezing things that held locks etc. This approach does seem better
than the freezer thing, and if I read it right it will gather things
in the signal handler code, but it's not obvious that gathering them
in random places where they sleep for random reasons is safe or a
good idea.

I can imagine _so_ many dead systems if you just basically froze
something that holds the mmap lock and is sleeping on a page fault,
for example.

Maybe I'm missing something, but I really think your "let's freeze
things" is seriously misguided. You're concentrating on some small
problem and trying to solve that, and not seeing the HUGE HONKING
problems that your approach is fundamentally introducing.

              Linus