On Tue, Jan 4, 2022 at 8:06 PM Minchan Kim <minchan@xxxxxxxxxx> wrote: > > On Tue, Jan 04, 2022 at 08:46:17AM -0300, Mauricio Faria de Oliveira wrote: > > On Thu, Dec 16, 2021 at 3:17 PM Minchan Kim <minchan@xxxxxxxxxx> wrote: > > ... > > > Hi Mauricio, > > > > > > Thanks for catching the bug. There is some comment before I would > > > look the problem in more detail. Please see below. > > > > > > > Hey! Thanks for looking into this. Sorry for the delay; I've been out > > a few weeks. > > > > > > diff --git a/mm/rmap.c b/mm/rmap.c > > > > index 163ac4e6bcee..f04151aae03b 100644 > > > > --- a/mm/rmap.c > > > > +++ b/mm/rmap.c > > > > @@ -1570,7 +1570,18 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, > > > > > > > > /* MADV_FREE page check */ > > > > if (!PageSwapBacked(page)) { > > > > - if (!PageDirty(page)) { > > > > + int refcount = page_ref_count(page); > > > > + > > > > + /* > > > > + * The only page refs must be from the isolation > > > > + * (checked by the caller shrink_page_list() too) > > > > + * and the (single) rmap (dropped by discard:). > > > > + * > > > > + * Check the reference count before dirty flag > > > > + * with memory barrier; see __remove_mapping(). > > > > + */ > > > > + smp_rmb(); > > > > + if (refcount == 2 && !PageDirty(page)) { > > > > > > A madv_free marked page could be mapped at several processes so > > > it wouldn't be refcount two all the time, I think. > > > Shouldn't we check it with page_mapcount with page_refcount? > > > > > > page_ref_count(page) - 1 > page_mapcount(page) > > > > > > > It's the other way around, isn't it? The madvise(MADV_FREE) call only clears > > the page dirty flag if page_mapcount() == 1 (ie not mapped by more processes). > > > > @ madvise_free_pte_range() > > > > /* > > * If page is shared with others, we couldn't clear > > * PG_dirty of the page. > > */ > > if (page_mapcount(page) != 1) { > > unlock_page(page); > > continue; > > } > > ... > > ClearPageDirty(page); > > unlock_page(page); > > > > > > If that's right, the refcount of 2 should be OK (one from the isolation, > > another one from the single map/one process.) > > > > Does that make sense? I might be missing something. > > A child process could be forked from parent after madvise so the page > mapcount of the hinted page could have elevated refcount. Thanks for clarifying. I was indeed missing something. :) I revisited your commit 854e9ed09ded ("mm: support madvise(MADV_FREE)"), and now realized that the restriction for map count == 1 is only for _cleaning_ the dirty flag, not for discarding the (clean) page mapping later on. """ To solve the problem, this patch clears PG_dirty if only the page is owned exclusively by current process when madvise is called because PG_dirty represents ptes's dirtiness in several processes so we could clear it only if we own it exclusively. """ I'll look at this tomorrow and submit a v2. Thanks! -- Mauricio Faria de Oliveira