On 08/13, Kirill A. Shutemov wrote: > > On Tue, Aug 13, 2019 at 04:05:53PM +0200, Oleg Nesterov wrote: > > > > > > I thought that retract_page_tables() checks vma->anon_vma to ensure that > > > this vma doesn't have a cow'ed PageAnon() page. And I still can't understand > > > why can't it race with __handle_mm_fault() paths. > > vma->anon_vma check is a cheap way to exclude MAP_PRIVATE mappings that > got written from userspace. Yes, and this is how I understood it from the very beginning, but then I was confused. > vma->anon_vma can be set up after the check but before taking mmap_sem. > But page lock would prevent establishing any new ptes of the page, so we > are safe. And this is what was not clear to me until I noticed unmap_mapping_pages() in collapse_shmem(). Plus I was confused by the comment above down_write_trylock(mmap_sem). To me it looks as if we _could_ do down_write(), but we do not want to disturb the system. But iiuc we simply can't do down_write(), exactly because handle_mm_fault() can wait for page lock with mmap_sem held. > > > Suppose that shmem_file was mmaped with PROT_READ|WRITE, MAP_PRIVATE. > > > To simplify, suppose that a non-THP page was already faulted in, > > > pte_present() == T. > > > > > > Userspace writes to this page. > > > > > > Why __handle_mm_fault()->handle_pte_fault()->do_wp_page()->wp_page_copy() > > > can not cow this page and update pte after the vma->anon_vma chech and > > > before down_write_trylock(mmap_sem) ? > > > > OK, probably this is impossible, collapse_shmem() does unmap_mapping_pages(), > > so handle_pte_fault() will call shmem_fault() which iiuc should block in > > find_lock_entry() because new_page is locked, and thus down_write_trylock() > > can't succeed. > > You've got it right. Great, thanks. > > Nevermind, I am sure I missed something. Perhaps you can update the comments > > to make this more clear. > > Let me see first that my explanation makes sense :P It does ;) Oleg.