Hi

> > Worst-case aio scenario:
> > -----------------------------------------------------------------------
> > io_setup() and gup                inc page_count
> >
> > fork                              inc mapcount
> >                                   and write-protect the pte
> >
> > write ring from userland(*)       page fault and
> >                                   COW break.
> >                                   parent process gets the copied page and
> >                                   child gets ownership of the original page.
> >
> > kmap and memcpy from kernel       changes the child's page. (it means data loss)
> >
> > (*) Does this happen?
>
> I guess it's possible, but I don't know of any programs that do this.

Yup, I also think this doesn't happen in the real world.

> > MADV_DONTFORK or down_read(mmap_sem) or down_read(mm_pinned_sem)
> > or a copy-at-fork mechanism (=Nick/Andrea patch) solves it.
>
> OK, thanks for the explanation.
>
> +	/*
> +	 * aio context doesn't inherit while fork. (see mm_init())
> +	 * Then, aio ring also mark DONTFORK.
> +	 */
>
> Would you mind if I did some word-smithing on that comment?  Something
> like:
> 	/*
> 	 * The io_context is not inherited by the child after fork()
> 	 * (see mm_init).  Therefore, it makes little sense for the
> 	 * completion ring to be inherited.
> 	 */
>
> +	ret = sys_madvise(info->mmap_base, info->mmap_size, MADV_DONTFORK);
> +	BUG_ON(ret);
> +
>
> It appears there's no other way to set the VM_DONTCOPY flag, so I guess
> calling sys_madvise is fine.  I'm not sure I agree with the BUG_ON(ret),
> however, as EAGAIN may be feasible.
>
> So, fix that up and you can add my reviewed-by.  I think you should push
> this patch independent of the other patches in this series.

Done :)
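
For anyone following the thread, here is a rough userspace sketch of what
MADV_DONTFORK does to a mapping across fork(). It is not the patch and does
not touch the aio ring; it only illustrates the VM_DONTCOPY behaviour
discussed above on an anonymous mapping. The helper name
child_sees_mapping() is made up for this illustration, and using mincore()
as a "is this range still mapped?" probe is my assumption about a
convenient way to check from the child without faulting.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one anonymous page, optionally mark it
 * MADV_DONTFORK, fork, and ask the child whether the range is still mapped.
 * Returns 1 if the child has the mapping, 0 if it does not, -1 on error.
 */
int child_sees_mapping(int use_dontfork)
{
	long page = sysconf(_SC_PAGESIZE);
	assert(page > 0);
	size_t len = (size_t)page;

	unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;
	buf[0] = 1;	/* touch the page so it is actually populated */

	/* Unlike BUG_ON(ret), report a madvise() failure to the caller. */
	if (use_dontfork && madvise(buf, len, MADV_DONTFORK) != 0) {
		munmap(buf, len);
		return -1;
	}

	pid_t pid = fork();
	if (pid < 0) {
		munmap(buf, len);
		return -1;
	}
	if (pid == 0) {
		/* Child: mincore() fails with ENOMEM on an unmapped range. */
		unsigned char vec;
		int mapped = (mincore(buf, len, &vec) == 0);
		_exit(mapped ? 1 : 0);
	}

	int status;
	pid_t waited = waitpid(pid, &status, 0);
	munmap(buf, len);
	if (waited < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Calling child_sees_mapping(0) and child_sees_mapping(1) from a small main()
should show the difference: without the advice the child inherits the
(COW-protected) page, with it the child has no mapping at that address, so
the COW race in the scenario above cannot arise for that range.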