On Wed, Apr 23, 2014 at 12:33:15PM -0700, Linus Torvalds wrote:
> On Wed, Apr 23, 2014 at 11:41 AM, Jan Kara <jack@xxxxxxx> wrote:
> >
> > Now I'm not sure how to fix Linus' patches. For all I care we could just
> > rip out pte dirty bit handling for file mappings. However last time I
> > suggested this you corrected me that tmpfs & ramfs need this. I assume this
> > is still the case - however, given we unconditionally mark the page dirty
> > for write faults, where exactly do we need this?
>
> Honza, you're missing the important part: it does not matter one whit
> that we unconditionally mark the page dirty, when we do it *early*,
> and it can then be marked clean before it's actually clean!
>
> The problem is that page cleaning can clean the page when there are
> still writers dirtying the page. Page table tear-down removes the
> entry from the page tables, but it's still there in the TLB on other
> CPUs.
>
> So other CPUs are possibly writing to the page, when
> clear_page_dirty_for_io() has marked it clean (because it didn't see
> the page table entries that got torn down, and it hasn't seen the
> dirty bit in the page yet).

So page_mkclean() does an rmap walk to mark the page RO, doing mm-wide
TLB invalidations as it goes. zap_pte_range(), however, only removes the
rmap entry after it does the ptep_get_and_clear_full(), which doesn't do
any TLB invalidation of its own.

So, as Linus says, page_mkclean() can actually miss a page: the page has
already been removed from the rmap, but because of zap_pte_range() other
CPUs can still have live TLB entries for it.

So in order to fix this we'd have to delay the page_remove_rmap() as
well, but that's tricky because rmap walkers cannot deal with an
in-progress teardown.

Will need to think on this.