Re: Dirty/Access bits vs. page content


On Sun, 27 Apr 2014, Hugh Dickins wrote:
> 
> But woke with a panic attack that we have overlooked the question
> of how page reclaim's page_mapped() checks are serialized.
> Perhaps this concern will evaporate with the morning dew,
> perhaps it will not...

It was a real concern, but we happen to be rescued by the innocuous-
looking is_page_cache_freeable() check at the beginning of pageout().
That check deserves its own comment, but that can follow later.

My concern was with page reclaim's shrink_page_list() racing against
munmap's or exit's (or madvise's) zap_pte_range() unmapping the page.

Once zap_pte_range() has cleared the pte from a vma, neither
try_to_unmap() nor page_mkclean() will see that vma as containing
the page, so neither will do its own TLB flush of the cpus involved
before proceeding to writepage.

Linus's patch (serializing with ptlock) and my patch (serializing
with i_mmap_mutex) each almost fix that, but it seemed not entirely:
because try_to_unmap() is only called when page_mapped(), and
page_mkclean() quits early without taking locks when !page_mapped().

So in the interval when zap_pte_range() has brought page_mapcount()
down to 0, but not yet flushed TLB on all mapping cpus, it looked as
if we still had a problem - neither try_to_unmap() nor page_mkclean()
would take the lock either of us rely upon for serialization.

But pageout()'s preliminary is_page_cache_freeable() check makes
it safe in the end: although page_mapcount() has gone down to 0,
page_count() remains raised until the free_pages_and_swap_cache()
after the TLB flush.
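
To make that ordering concrete, here is a minimal userspace sketch of
the counters involved.  It is a toy model, not kernel code: struct
toy_page, the tlb_stale flag and every toy_*() helper are hypothetical
names invented for illustration.  Only the shape of the freeable test
(count minus has-private equals 2: the isolating caller plus the page
cache) follows is_page_cache_freeable(), and the starting refcount of
3 assumes a page cache page mapped by a single pte and already
isolated by reclaim.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page: only the counters discussed here. */
struct toy_page {
	int refcount;     /* models page_count() */
	int mapcount;     /* models page_mapcount() */
	bool has_private; /* models page_has_private() */
	bool tlb_stale;   /* hypothetical: a cpu may still hold a TLB entry */
};

/* Assumed start: page cache ref + one mapping ref + reclaim's isolation ref. */
struct toy_page toy_mapped_isolated_page(void)
{
	struct toy_page p = { .refcount = 3, .mapcount = 1,
			      .has_private = false, .tlb_stale = false };
	return p;
}

bool toy_page_mapped(const struct toy_page *p)
{
	return p->mapcount > 0;
}

/* Same shape as the kernel check: only the caller and the page cache
 * (plus optional buffer heads) may hold references. */
bool toy_is_page_cache_freeable(const struct toy_page *p)
{
	return p->refcount - (p->has_private ? 1 : 0) == 2;
}

/* Zap side, step 1: pte cleared, mapcount dropped, reference kept. */
void toy_zap_clear_pte(struct toy_page *p)
{
	assert(p->mapcount > 0);
	p->mapcount--;
	p->tlb_stale = true;
}

/* Zap side, step 2: TLB flushed on all mapping cpus. */
void toy_zap_flush_tlb(struct toy_page *p)
{
	p->tlb_stale = false;
}

/* Zap side, step 3: the reference is dropped only after the flush. */
void toy_zap_free_page(struct toy_page *p)
{
	assert(!p->tlb_stale);
	assert(p->refcount > 0);
	p->refcount--;
}

/* Reclaim side: would pageout() go on to writepage? */
bool toy_reclaim_may_writepage(const struct toy_page *p)
{
	return toy_is_page_cache_freeable(p);
}
```

In this model, between toy_zap_clear_pte() and toy_zap_free_page()
the page is no longer "mapped" yet toy_reclaim_may_writepage() stays
false, which is the window described above: the raised count, not
either lock, is what holds writepage off until after the flush.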

So I now believe we're safe after all with either patch, and happy
for Linus to go ahead with his.

Peter, returning at last to your question of whether we could exempt
shmem from the added overhead of either patch: until just now I
thought not, because of the possibility that shmem_writepage()
could occur while one of the mm's cpus, remote from the cpu running
zap_pte_range(), was still modifying the page.  But now that I see
the role played by is_page_cache_freeable(), and of course the zapping
end has never dropped its reference on the page before the TLB flush,
however late that occurred, hmmm, maybe yes, shmem can be exempted.

But I'd prefer to dwell on that a bit longer: we can add that as
an optimization later if it holds up to scrutiny.

Hugh