On Fri, Jul 31, 2020 at 10:19 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>
> Even limited to opt-in locations I think the trick of being able to
> transform the wait-state may solve that composition problem.

So the part I found intriguing was the "catch things in the signal
handling path".

Catching things there - and *only* there - would avoid a lot of the
problems we had with the freezer. When you're about to return to user
mode, there are no lock inversions etc.

And it kind of makes conceptual sense to do, since what you're trying
to capture is the signal group - so using the signal state to do so
seems like a natural thing to do.

No touching of any runqueues or scheduler data structures, do
everything _purely_ with the signal handling pathways.

So that "feels" ok to me.

That said, I do wonder if there are nasty nasty latency issues with
odd users. Normally, you'd expect that execve() with other threads in
the group shouldn't be a performance issue, because people simply
shouldn't do that. So it might be ok.

And if you capture them all in the signal handling pathway, that ends
up being a very convenient place to zap them all too, so maybe my
latency worry is misguided.

IOW, I think that you could try to do your "freeze other threads" not
at all like the freezer, but more like a "collect all threads in their
signal handler parts as the first phase of zapping them".

So maybe this approach is salvageable. I see where something like the
above could work well. But I say that with a lot of handwaving, and
maybe if I see the patch I'd go "Christ, I was a complete idiot for
ever even suggesting that".

             Linus