On Tue, Jul 4, 2023 at 6:07 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Tue, Jul 04, 2023 at 09:18:18AM +0200, David Hildenbrand wrote:
> > > At least the reproducer at
> > > https://bugzilla.kernel.org/show_bug.cgi?id=217624 is working now. But
> > > I wonder if that's the best way to fix this. It's surely simple but
> > > locking every VMA is not free and doing that on every fork might
> > > regress performance.
> >
> > That would mean that we can possibly still get page faults concurrent to
> > fork(), on the yet unprocessed part. While that fixes the issue at hand, I
> > cannot reliably tell if this doesn't mess with some other fork() corner
> > case.
> >
> > I'd suggest write-locking all VMAs upfront, before doing any kind of fork-mm
> > operation. Just like the old code did. See below.
>
> Calling fork() from a multi-threaded program is fraught with danger.
> It's a rare thing to do, and we don't need to optimise for it. It
> does, of course, need to not crash. But we can slow it down as much as
> we want to. Slowing down single-threaded programs calling fork is
> much less acceptable.

Hmm. Would you suggest we use different approaches for multi-threaded
vs single-threaded programs?

I think locking VMAs while forking a process that has lots of VMAs
will regress by some amount (we are adding non-zero work). The question
is whether that's acceptable or we have to implement something
different.

I verified that the solution fixes the issue shown by the reproducer;
now I'm trying to quantify the fork performance regression I suspect
we will introduce.

> https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html
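
For concreteness, here is a minimal sketch (not a tested patch) of the
upfront write-locking approach David describes, modeled on dup_mmap() in
kernel/fork.c with CONFIG_PER_VMA_LOCK. The helper name lock_parent_vmas()
is made up for illustration; vma_start_write(), for_each_vma() and
VMA_ITERATOR() are the existing per-VMA lock / VMA iterator API:

/*
 * Sketch only: write-lock every VMA of the parent before copying
 * anything, so the whole address space is protected for the duration
 * of fork(), not just the already-processed part.
 */
static void lock_parent_vmas(struct mm_struct *oldmm)
{
	struct vm_area_struct *mpnt;
	VMA_ITERATOR(vmi, oldmm, 0);

	/*
	 * Caller must hold mmap_write_lock(oldmm). Write-locking each
	 * VMA makes concurrent per-VMA-lock page faults fall back to
	 * waiting on mmap_lock, so no fault can run against a
	 * not-yet-copied part of the parent's address space.
	 */
	for_each_vma(vmi, mpnt)
		vma_start_write(mpnt);

	/*
	 * No explicit per-VMA unlock is needed: the VMA write locks
	 * are released collectively when the caller eventually does
	 * mmap_write_unlock(oldmm), which bumps the mm's lock
	 * sequence count.
	 */
}

In dup_mmap() this would presumably run right after taking
mmap_write_lock(oldmm) and before the copy loop. The trade-off is the one
discussed above: faults can never race with fork(), matching the old
mmap_sem-only behavior, but every fork() pays an extra O(number-of-VMAs)
pass even when no concurrent fault would have occurred.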