On Wed, Aug 9, 2023 at 2:07 PM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote: > > On 8/5/23, Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote: > > On Fri, Aug 4, 2023 at 6:06 PM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote: > >> > >> On 8/5/23, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > >> > On Fri, 4 Aug 2023 at 16:25, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote: > >> >> > >> >> I know of these guys, I think they are excluded as is -- they go > >> >> through access_remote_vm, starting with: > >> >> if (mmap_read_lock_killable(mm)) > >> >> return 0; > >> >> > >> >> while dup_mmap already write locks the parent's mm. > >> > > >> > Oh, you're only worried about vma_start_write()? > >> > > >> > That's a non-issue. It doesn't take the lock normally, since it starts > >> > off > >> > with > >> > > >> > if (__is_vma_write_locked(vma, &mm_lock_seq)) > >> > return; > >> > > >> > which catches on the lock sequence number already being set. > >> > > >> > So no extra locking there. > >> > > >> > Well, technically there's extra locking because the code stupidly > >> > doesn't initialize new vma allocations to the right sequence number, > >> > but that was talked about here: > >> > > >> > > >> > https://lore.kernel.org/all/CAHk-=wiCrWAoEesBuoGoqqufvesicbGp3cX0LyKgEvsFaZNpDA@xxxxxxxxxxxxxx/ > >> > > >> > and it's a separate issue. > >> > > >> > >> I'm going to bet one beer this is the issue. > >> > >> The patch I'm responding to only consists of adding the call to > >> vma_start_write and claims the 5% slowdown from it, while fixing > >> crashes if the forking process is multithreaded. > >> > >> For the fix to work it has to lock something against the parent. > >> > >> VMA_ITERATOR(old_vmi, oldmm, 0); > >> [..] > >> for_each_vma(old_vmi, mpnt) { > >> [..] > >> vma_start_write(mpnt); > >> > >> the added line locks an obj in the parent's vm space. > >> > >> The problem you linked looks like pessimization for freshly allocated > >> vmas, but that's what is being operated on here. > > > > Sorry, now I'm having trouble understanding the problem you are > > describing. We are locking the parent's vma before copying it and the > > newly created vma is locked before it's added into the vma tree. What > > is the problem then? > > > > Sorry for the late reply! > > Looks there has been a bunch of weird talking past one another in this > thread and I don't think trying to straighten it all out is worth any > time. > > I think at least the two of us agree that if a single-threaded process > enters dup_mmap an > down_writes the mmap semaphore, then no new thread can pop up in said > process, thus no surprise page faults from that angle. 3rd parties are > supposed to interfaces like access_remote_vm, which down_read said > semaphore and are consequently also not a problem. The only worry here > is that someone is messing with another process memory without the > semaphore, but is very unlikely and patchable in the worst case -- but > someone(tm) has to audit. With all these conditions satisfied one can > elide vma_start_write for a perf win. > > Finally, I think we agreed you are going to do the audit ;) Ack. I'll look into this once the dust settles. Thanks! > > Cheers, > -- > Mateusz Guzik <mjguzik gmail.com>