On Mon, Dec 21, 2020 at 05:30:41PM -0500, Peter Xu wrote: > On Mon, Dec 21, 2020 at 01:49:55PM -0800, Nadav Amit wrote: > > BTW: In general, I think that you are right, and that changing of PTEs > > should not require taking mmap_lock for write. However, I am not sure > > cow_user_page() is not the only one that poses a problem and whether a more > > systematic solution is needed. If cow_user_pages() is the only problem, do > > you think it is possible to do the copying while holding the PTL? It works > > for normal-pages, but I am not sure whether special-pages pose special > > problems. > > > > Anyhow, this is an enhancement that we can try later. > > AFAIU mprotect() is the only one who modifies the pte using the mmap write > lock. NUMA balancing is also using read mmap lock when changing pte > protections, NUMA balance doesn't clear pte_write() -- I would not call setting pte_none() a change of protection. > while my understanding is mprotect() used write lock only because > it manipulates the address space itself (aka. vma layout) rather than modifying > the ptes, so it needs to. Yes, and personally, I would only take mmap lock for write when I change VMAs, not PTE protections. > At the pte level, it seems always to be the pgtable lock that serializes things. > > So it's perfectly legal to me for e.g. a driver to modify ptes with the read > lock of mmap_sem, unless I'm severely mistaken.. as long as the pgtable lock is > taken when doing so. > > If there's a driver that manipulated the ptes, changed the content of the page, > recover the ptes to origin, and all these happen right after wp_page_copy() > unlocked the pgtable lock but before wp_page_copy() retakes the same lock > again, we may face the same issue finding that the page got copied contains > corrupted data at last. While I don't know what to blame on the driver either > because it seems to be exactly following the rules. > > I believe changing into write lock would solve the race here because tlb > flushing would be guaranteed along the way, but I'm just a bit worried it's not > the best way to go.. I can't say I disagree with you but the man has made the call and I think we should just move on.