On Mon, Sep 14, 2020 at 03:59:31PM -0700, Linus Torvalds wrote:
> On Mon, Sep 14, 2020 at 3:55 PM Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >
> > Just as an aside, the RDMA stuff is also supposed to set MADV_DONTFORK
> > on these regions, so I'm a bit puzzled what is happening here.
>
> Did the fork perhaps happen _before_, so the pages are shared when
> you do the pin?

Looking at the program, it seems there are a number of forks for exec
before and after pin_user_pages_fast(), but the parent process always
does waitpid() after the fork.

> MADV_DONTFORK doesn't mean COW doesn't happen. It just means that the
> next fork() won't be copying that memory area.

Yes, this stuff does pin_user_pages_fast() and MADV_DONTFORK
together. It sets FOLL_FORCE and FOLL_WRITE to get an exclusive copy
of the page, and MADV_DONTFORK was needed to ensure that a future fork
doesn't establish a COW that would break the DMA by moving the
physical page over to the fork. DMA should stay with the process that
called pin_user_pages_fast().

(Is MADV_DONTFORK still needed with recent years' work on GUP/etc? It
is a pretty terrible ancient thing.)

> That said, it's possible that the test cases do something invalid - or
> maybe we've broken MADV_DONTFORK - and it all just happened to work
> before.

Hmm. If symptoms stop with this patch, should we investigate
MADV_DONTFORK?

Thanks,
Jason
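
For reference, a minimal userspace sketch of the MADV_DONTFORK behaviour
discussed above. It is not the RDMA test program and does not call
pin_user_pages_fast() (that is a kernel-side API); it only assumes a
Linux system with madvise(2). The function name
child_faults_on_dontfork_region() is hypothetical. The idea: the parent
marks an anonymous mapping MADV_DONTFORK, forks, and the child should
take SIGSEGV on touching the region because the mapping was never
copied into it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if a child forked after MADV_DONTFORK dies with SIGSEGV when
 * it touches the region, 0 if the child could read it, -1 on setup error.
 */
int child_faults_on_dontfork_region(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    unsigned char *buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* Fault the page in so the parent really owns a physical page. */
    buf[0] = 0x5a;

    /* Ask the kernel not to copy this mapping into future children. */
    if (madvise(buf, (size_t)page, MADV_DONTFORK) != 0) {
        munmap(buf, (size_t)page);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, (size_t)page);
        return -1;
    }

    if (pid == 0) {
        /* Child: make sure SIGSEGV has its default (fatal) action. */
        signal(SIGSEGV, SIG_DFL);
        /* The mapping should be absent here, so this read should fault. */
        (void)*(volatile unsigned char *)buf;
        _exit(0); /* only reached if the region was inherited */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(buf, (size_t)page);
        return -1;
    }

    /* Parent: the mapping and its contents are untouched by the fork. */
    int parent_ok = (buf[0] == 0x5a);
    munmap(buf, (size_t)page);
    if (!parent_ok)
        return -1;

    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV) ? 1 : 0;
}
```

Note that this only exercises forks that happen after the madvise()
call. Pages that were already shared by an earlier fork are a separate
case, which is the scenario raised in the quoted question above.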