Re: [PATCH] drop_caches: Allow unmapping pages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 07, 2019 at 11:25:16AM -0800, Dave Hansen wrote:
> On 1/7/19 6:15 AM, Matthew Wilcox wrote:
> > You're going to get data corruption doing this.  try_to_unmap_one()
> > does:
> > 
> >          /* Move the dirty bit to the page. Now the pte is gone. */
> >          if (pte_dirty(pteval))
> >                  set_page_dirty(page);
> > 
> > so PageDirty() can be false above, but made true by calling
> > try_to_unmap().
> 
> I don't think that PageDirty() check is _required_ for correctness.  You
> can always safely try_to_unmap() no matter the state of the PTE.  We
> can't lock out the hardware from setting the Dirty bit, ever.
> 
> It's also just fine to unmap PageDirty() pages, as long as when the PTE
> is created, we move the dirty bit from the PTE into the 'struct page'
> (which try_to_unmap() does, as you noticed).

Right, but the very next thing the patch does is call
invalidate_complete_page(), which calls __remove_mapping() which ... oh,
re-checks PageDirty() and refuses to drop the page.  So this isn't the
data corruptor that I had thought it was.

> > I also think the way you've done this is expedient at the cost of
> > efficiency and layering violations.  I think you should first tear
> > down the mappings of userspace processes (which will reclaim a lot
> > of pages allocated to page tables), then you won't need to touch the
> > invalidate_inode_pages paths at all.
> 
> By "tear down the mappings", do you mean something analogous to munmap()
> where the VMA goes away?  Or madvise(MADV_DONTNEED) where the PTE is
> destroyed but the VMA remains?
> 
> Last time I checked, we only did free_pgtables() when tearing down VMAs,
> but not for pure unmappings like reclaim or MADV_DONTNEED.  I've thought
> it might be fun to make a shrinker that scanned page tables looking for
> zero'd pages, but I've never run into a system where empty page table
> pages were actually causing a noticeable problem.

A few hours ago when I thought this patch had the ordering of checking
PageDirty() the wrong way round, I had the madvise analogy in mind so
that the PTEs would get destroyed and the dirty information transferred
to the struct page first before trying to drop pages.



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux