On 8/5/23, Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote: > On Fri, Aug 4, 2023 at 5:26 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> > wrote: >> >> On Fri, Aug 4, 2023 at 5:15 PM Linus Torvalds >> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: >> > >> > On Fri, 4 Aug 2023 at 16:25, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote: >> > > >> > > I know of these guys, I think they are excluded as is -- they go >> > > through access_remote_vm, starting with: >> > > if (mmap_read_lock_killable(mm)) >> > > return 0; >> > > >> > > while dup_mmap already write locks the parent's mm. >> > >> > Oh, you're only worried about vma_start_write()? >> > >> > That's a non-issue. It doesn't take the lock normally, since it starts >> > off with >> > >> > if (__is_vma_write_locked(vma, &mm_lock_seq)) >> > return; >> > >> > which catches on the lock sequence number already being set. >> >> That check will prevent re-locking but if vma is not already locked >> then the call will proceed with obtaining the lock and setting >> vma->vm_lock_seq to mm->mm_lock_seq. > > The optimization Mateusz describes looks valid to me. If there is > nobody else to fault a page and mm_users is stable (which I think it > is because we are holding mmap_lock for write) then we can skip vma > locking, I think. > mm_users is definitely *not* stable -- it can be bumped by get_task_mm, which is only synchronized with task lock. However, the other users (that I know of ) go through the mmap semaphore to mess with anything which means they will wait for dup_mmap to finish (or do their work first). I would be surprised if there were any cases which don't take the semaphore, given that it was a requirement prior to the vma patchset (unless you patched some to no longer need it?). I would guess worst case the semaphore can be added if missing. What is guaranteed is that if the forking process is single-threaded, there will be no threads added out of nowhere -- the only thread which could do it is busy creating one in dup_mmap. If multithreaded operation of the forking process was the only problem, that's it. >> >> > >> > So no extra locking there. >> > >> > Well, technically there's extra locking because the code stupidly >> > doesn't initialize new vma allocations to the right sequence number, >> > but that was talked about here: >> > >> > >> > https://lore.kernel.org/all/CAHk-=wiCrWAoEesBuoGoqqufvesicbGp3cX0LyKgEvsFaZNpDA@xxxxxxxxxxxxxx/ >> > >> > and it's a separate issue. >> > >> > Linus > -- Mateusz Guzik <mjguzik gmail.com>