On 2021/7/15 3:43, John Hubbard wrote: > On 7/14/21 4:48 AM, Matthew Wilcox wrote: >> On Wed, Jul 14, 2021 at 07:36:57PM +0800, Miaohe Lin wrote: >>> On 2021/7/13 21:34, Matthew Wilcox wrote: >>>> On Tue, Jul 13, 2021 at 09:13:51PM +0800, Miaohe Lin wrote: >>>>>>> When the MADV_FREE pages are redirtied before they could be reclaimed, the pages >>>>>>> should be put back to anonymous LRU list by setting SwapBacked flag, thus the >>>>>>> pages will be reclaimed in normal swapout way. >>>>>> >>>>>> Agreed. But the question is why this needs an explicit handling here >>>>>> when we already do handle this case when trying to unmap the page. >>>>> >>>>> This makes me think more. It seems even the page_ref_freeze call is guaranteed to >>>>> success as no one can grab the page refcnt after the page is successfully unmapped. >>>> >>>> NO! This is wrong. Every page can have its refcount speculatively raised >>>> (and then lowered). The two prime candidates for this are lockless GUP >>>> and page cache lookups, but there can be others too. >>>> >>> >>> Many thanks for pointing this out. My overlook! Sorry! >>> So, it seems lockless GUP can redirty the MADV_FREE page. But is it ok to just release >>> a redirtied MADV_FREE pages? Because we hold the last reference here and the page will >>> be freed anyway... >> >> I don't see how lockless GUP can redirty the page. It can grab the >> refcount, thus making the refcount here two. Then the call to freeze >> here fails and the page stays on the list. But the lockless GUP checks >> the page is still in the page table (and discovers it isn't, so releases >> the reference count). Am I missing a path that lets lockless GUP dirty >> the page? >> > > If a device driver pins some pages using gup, and the device then uses dma > to write to those pages, then you could get there. That story is part of the > reasoning that led to creating pin_user_pages(), which btw does not yet > fully solve that case. Many thanks for your explanation. So the similar scenario that is clarified in the __remove_mapping() is possible: get_user_pages(&page); [user mapping goes away] write_to(page); !PageDirty(page) [good] SetPageDirty(page); put_page(page); !page_count(page) [good, discard it] [oops, our write_to data is lost] The page can be redirtied after the page is unmapped. And there is no way to restore the page table as clean MADV_FREE page is simply cleared from page table via the try_to_unmap path. Is it ok to just release the redirtied MADV_FREE pages here as we hold the last reference and the page will be freed anyway... ? > > Basically, though, unless a non-CPU device has access to the page, it's > hard to see how gup itself can lead to a page getting dirtied. > > thanks,