Re: [PATCH] mm: fix race between MADV_FREE reclaim and blkdev direct IO read

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 04, 2022 at 08:46:17AM -0300, Mauricio Faria de Oliveira wrote:
> On Thu, Dec 16, 2021 at 3:17 PM Minchan Kim <minchan@xxxxxxxxxx> wrote:
> ...
> > Hi Mauricio,
> >
> > Thanks for catching the bug. There is some comment before I would
> > look the problem in more detail. Please see below.
> >
> 
> Hey! Thanks for looking into this. Sorry for the delay; I've been out
> a few weeks.
> 
> > > diff --git a/mm/rmap.c b/mm/rmap.c
> > > index 163ac4e6bcee..f04151aae03b 100644
> > > --- a/mm/rmap.c
> > > +++ b/mm/rmap.c
> > > @@ -1570,7 +1570,18 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
> > >
> > >                       /* MADV_FREE page check */
> > >                       if (!PageSwapBacked(page)) {
> > > -                             if (!PageDirty(page)) {
> > > +                             int refcount = page_ref_count(page);
> > > +
> > > +                             /*
> > > +                              * The only page refs must be from the isolation
> > > +                              * (checked by the caller shrink_page_list() too)
> > > +                              * and the (single) rmap (dropped by discard:).
> > > +                              *
> > > +                              * Check the reference count before dirty flag
> > > +                              * with memory barrier; see __remove_mapping().
> > > +                              */
> > > +                             smp_rmb();
> > > +                             if (refcount == 2 && !PageDirty(page)) {
> >
> > A madv_free marked page could be mapped at several processes so
> > it wouldn't be refcount two all the time, I think.
> > Shouldn't we check it with page_mapcount with page_refcount?
> >
> >     page_ref_count(page) - 1  > page_mapcount(page)
> >
> 
> It's the other way around, isn't it? The madvise(MADV_FREE) call only clears
> the page dirty flag if page_mapcount() == 1 (ie not mapped by more processes).
> 
> @ madvise_free_pte_range()
> 
>                         /*
>                          * If page is shared with others, we couldn't clear
>                          * PG_dirty of the page.
>                          */
>                         if (page_mapcount(page) != 1) {
>                                 unlock_page(page);
>                                 continue;
>                         }
> ...
>                         ClearPageDirty(page);
>                         unlock_page(page);
> 
> 
> If that's right, the refcount of 2 should be OK (one from the isolation,
> another one from the single map/one process.)
> 
> Does that make sense?  I might be missing something.

A child process could be forked from parent after madvise so the page
mapcount of the hinted page could have elevated refcount.



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux