On 8/5/23, Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote: > On Fri, Aug 4, 2023 at 6:06 PM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote: >> >> On 8/5/23, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: >> > On Fri, 4 Aug 2023 at 16:25, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote: >> >> >> >> I know of these guys, I think they are excluded as is -- they go >> >> through access_remote_vm, starting with: >> >> if (mmap_read_lock_killable(mm)) >> >> return 0; >> >> >> >> while dup_mmap already write locks the parent's mm. >> > >> > Oh, you're only worried about vma_start_write()? >> > >> > That's a non-issue. It doesn't take the lock normally, since it starts >> > off >> > with >> > >> > if (__is_vma_write_locked(vma, &mm_lock_seq)) >> > return; >> > >> > which catches on the lock sequence number already being set. >> > >> > So no extra locking there. >> > >> > Well, technically there's extra locking because the code stupidly >> > doesn't initialize new vma allocations to the right sequence number, >> > but that was talked about here: >> > >> > >> > https://lore.kernel.org/all/CAHk-=wiCrWAoEesBuoGoqqufvesicbGp3cX0LyKgEvsFaZNpDA@xxxxxxxxxxxxxx/ >> > >> > and it's a separate issue. >> > >> >> I'm going to bet one beer this is the issue. >> >> The patch I'm responding to only consists of adding the call to >> vma_start_write and claims the 5% slowdown from it, while fixing >> crashes if the forking process is multithreaded. >> >> For the fix to work it has to lock something against the parent. >> >> VMA_ITERATOR(old_vmi, oldmm, 0); >> [..] >> for_each_vma(old_vmi, mpnt) { >> [..] >> vma_start_write(mpnt); >> >> the added line locks an obj in the parent's vm space. >> >> The problem you linked looks like pessimization for freshly allocated >> vmas, but that's what is being operated on here. > > Sorry, now I'm having trouble understanding the problem you are > describing. We are locking the parent's vma before copying it and the > newly created vma is locked before it's added into the vma tree. What > is the problem then? > Sorry for the late reply! Looks there has been a bunch of weird talking past one another in this thread and I don't think trying to straighten it all out is worth any time. I think at least the two of us agree that if a single-threaded process enters dup_mmap an down_writes the mmap semaphore, then no new thread can pop up in said process, thus no surprise page faults from that angle. 3rd parties are supposed to interfaces like access_remote_vm, which down_read said semaphore and are consequently also not a problem. The only worry here is that someone is messing with another process memory without the semaphore, but is very unlikely and patchable in the worst case -- but someone(tm) has to audit. With all these conditions satisfied one can elide vma_start_write for a perf win. Finally, I think we agreed you are going to do the audit ;) Cheers, -- Mateusz Guzik <mjguzik gmail.com>