On Tue, Sep 26, 2017 at 03:40:17PM -0400, Johannes Weiner wrote:
> On Tue, Sep 26, 2017 at 10:26:26AM -0700, Shaohua Li wrote:
> > From: Shaohua Li <shli@xxxxxx>
> >
> > MADV_FREE clears pte dirty bit and then marks the page lazyfree (clear
> > SwapBacked). There is no lock to prevent the page is added to swap cache
> > between these two steps by page reclaim. If page reclaim finds such
> > page, it will simply add the page to swap cache without pageout the page
> > to swap because the page is marked as clean. Next time, page fault will
> > read data from the swap slot which doesn't have the original data, so we
> > have a data corruption. To fix issue, we mark the page dirty and pageout
> > the page.
>
> Reclaim and MADV_FREE hold the page lock when manipulating the dirty
> and the swapcache state.
>
> Instead of undoing a racing MADV_FREE in reclaim, wouldn't it be safe
> to check the dirty bit before add_to_swap() and skip clean pages?

That would work, but I don't see an easy/clean way to check the dirty bit.
Since the race is rare, I think this optimization isn't worth it.

Thanks,
Shaohua
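
For anyone following the thread, the interleaving described in the patch can be sketched as a small userspace toy model. This is only an illustration under stated assumptions, not kernel code: struct toy_page, struct toy_swap_slot and every function below are hypothetical names made up for the sketch, and the with_fix flag stands in for "mark the page dirty when it is added to the swap cache" as the patch description puts it.

```c
#include <stdbool.h>
#include <string.h>

/* Toy stand-ins; none of these are real kernel structures or fields. */
struct toy_page {
    bool pte_dirty;    /* dirty bit in the mapping pte */
    bool page_dirty;   /* page-level dirty flag */
    bool swap_backed;  /* cleared once the page is marked lazyfree */
    char data[8];      /* in-memory page contents */
};

struct toy_swap_slot {
    char data[8];      /* whatever happens to be in the swap slot */
};

/* MADV_FREE, first step: the page and its pte are made clean. */
static void madv_free_clear_dirty(struct toy_page *p)
{
    p->pte_dirty = false;
    p->page_dirty = false;
}

/* MADV_FREE, second step: the page is marked lazyfree. */
static void madv_free_mark_lazyfree(struct toy_page *p)
{
    p->swap_backed = false;
}

/* Reclaim of an anonymous page, heavily simplified. */
static void toy_reclaim(struct toy_page *p, struct toy_swap_slot *slot,
                        bool with_fix)
{
    if (!p->swap_backed)
        return;                 /* lazyfree pages are not swapped in this model */

    /* The page is added to the swap cache here. */
    if (with_fix)
        p->page_dirty = true;   /* assumed fix: always force a writeout */

    if (p->pte_dirty)
        p->page_dirty = true;   /* unmap transfers the pte dirty bit */

    if (p->page_dirty) {
        memcpy(slot->data, p->data, sizeof(slot->data));  /* pageout */
        p->page_dirty = false;
    }
    memset(p->data, 0, sizeof(p->data));  /* page frame is freed */
}

/* Page fault: contents are read back from the swap slot. */
static void toy_swap_in(struct toy_page *p, const struct toy_swap_slot *slot)
{
    memcpy(p->data, slot->data, sizeof(p->data));
}

/* Runs the racy interleaving; returns true if the original data comes back. */
bool toy_data_survives(bool with_fix)
{
    struct toy_page page = {
        .pte_dirty = true, .page_dirty = false,
        .swap_backed = true, .data = "payload",
    };
    struct toy_swap_slot slot = { .data = "stale" };

    madv_free_clear_dirty(&page);          /* MADV_FREE step 1 */
    toy_reclaim(&page, &slot, with_fix);   /* reclaim slips in between */
    madv_free_mark_lazyfree(&page);        /* MADV_FREE step 2, too late */
    toy_swap_in(&page, &slot);             /* later page fault */

    return strcmp(page.data, "payload") == 0;
}
```

In this model toy_data_survives(false) reads back the stale slot contents, while toy_data_survives(true) gets the original data because the writeout is forced. The alternative suggested above (skip clean pages before add_to_swap()) is not modelled here.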