Re: [PATCH 1/1] mm: disable CONFIG_PER_VMA_LOCK by default until it's fixed

On 04.07.23 19:56, Suren Baghdasaryan wrote:
On Tue, Jul 4, 2023 at 10:36 AM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 04.07.23 19:21, Suren Baghdasaryan wrote:
On Tue, Jul 4, 2023 at 6:07 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

On Tue, Jul 04, 2023 at 09:18:18AM +0200, David Hildenbrand wrote:
At least the reproducer at
https://bugzilla.kernel.org/show_bug.cgi?id=217624 is working now. But
I wonder if that's the best way to fix this. It's surely simple but
locking every VMA is not free and doing that on every fork might
regress performance.


That would mean that we could still get page faults concurrent to
fork() on the not-yet-processed part. While that fixes the issue at hand, I
cannot reliably tell whether it messes with some other fork() corner
case.

I'd suggest write-locking all VMAs upfront, before doing any kind of fork-mm
operation. Just like the old code did. See below.
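
(The snippet referred to with "see below" is not preserved in this archive
view. As a rough, hypothetical sketch of the idea -- not the actual posted
code -- assuming the 6.4 per-VMA lock API, the upfront write-locking in
kernel/fork.c:dup_mmap() could look something like this, right after the
parent's mmap lock is taken and before any VMA is copied:

	struct vm_area_struct *mpnt;
	VMA_ITERATOR(vmi, oldmm, 0);

	/*
	 * Mark every parent VMA write-locked so that concurrent page faults
	 * which only take the per-VMA lock are excluded for the whole
	 * duration of the copy.  vma_start_write() is a no-op without
	 * CONFIG_PER_VMA_LOCK, so !CONFIG_PER_VMA_LOCK stays unaffected.
	 */
	for_each_vma(vmi, mpnt)
		vma_start_write(mpnt);

In the 6.4 implementation the VMAs are effectively unlocked again when
mmap_write_unlock(oldmm) bumps the mm's lock sequence at the end of
dup_mmap(), so no explicit per-VMA unlock loop is needed.)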

Calling fork() from a multi-threaded program is fraught with danger.
It's a rare thing to do, and we don't need to optimise for it.  It
does, of course, need to not crash.  But we can slow it down as much as
we want to.  Slowing down single-threaded programs calling fork is
much less acceptable.

Hmm. Would you suggest we use different approaches for multi-threaded
vs single-threaded programs?
I think locking VMAs while forking a process which has lots of VMAs
will regress by some amount (we are adding non-zero work). The
question is whether that's acceptable or we have to implement something
different. I verified that the solution fixes the issue shown by the
reproducer; now I'm trying to quantify the fork performance
regression I suspect we will introduce.

Well, the design decision CONFIG_PER_VMA_LOCK made for now is to make
page faults fast and to make blocking any page faults from happening
slower (unless there is some easy way that's already built in).

So it wouldn't surprise me if it affects performance a bit, but it
remains to be quantified whether that really matters in comparison to all
the page table copying and other work we do during fork.

Maybe that can be optimized/sped up later. But for now we should fix
this the straightforward way. That fix will be (and has to be) a NOP for
!CONFIG_PER_VMA_LOCK, so that one won't be affected.

Maybe this patch, in an adjusted form, would still make sense (also to be
backported), to keep the feature inactive by default until it has
stabilized a bit more.

Ok, IIUC your suggestion is to use the VMA-lock-on-fork fix even if
fork() regresses, and to keep CONFIG_PER_VMA_LOCK disabled by default
until it's more stable. That sounds good to me. With that fix, do we
still need to add the BROKEN dependency? I'm guessing it would be
safer to disable it for sure.

With this fixed, I don't think we need a BROKEN dependency.

I'll let you decide whether you want to keep it enabled by default; I'd rather disable it for one release and enable it by default later.
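
(As a hypothetical illustration -- not the actual posted patch -- assuming
the 6.4-era mm/Kconfig entry, "disabled by default but not BROKEN" would
amount to something like:

	config PER_VMA_LOCK
		bool "Enable per-vma locking during page fault handling"
		depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP

i.e. default n but still user-selectable, whereas additionally adding a
"depends on BROKEN" line would hide the option from normal builds
entirely.)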

Happy to learn whether taking all VMA locks without any contention causes a lot of harm.

--
Cheers,

David / dhildenb




