Re: [PATCH v2] mm: fix race between MADV_FREE reclaim and blkdev direct IO read

On Tue, Jan 11, 2022 at 11:29:36AM -0800, John Hubbard wrote:
> On 1/11/22 10:54, Minchan Kim wrote:
> ...
> > Hi Yu,
> > 
> > I think you're correct. I think we don't want a memory barrier
> > there in page_dup_rmap. Then how about making gup_fast aware
> > of FOLL_TOUCH?
> > 
> > FOLL_TOUCH means it's going to write something so the page
> 
> Actually, my understanding of FOLL_TOUCH is that it does *not* mean that
> data will be written to the page. That is what FOLL_WRITE is for.
> FOLL_TOUCH means: update the "accessed" metadata, without actually
> writing to the memory that the page represents.

Exactly. I should have mentioned FOLL_TOUCH together with FOLL_WRITE.
What I wanted to hit with FOLL_TOUCH was this path in follow_page_pte:

        if (flags & FOLL_TOUCH) {
                if ((flags & FOLL_WRITE) &&
                    !pte_dirty(pte) && !PageDirty(page))
                        set_page_dirty(page);
                mark_page_accessed(page);
        }
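
To make the flag interaction concrete, here is a minimal userspace sketch
of the branch above. It is not kernel code: struct model_page,
model_follow_touch() and the MODEL_FOLL_* values are hypothetical
stand-ins chosen for illustration, and the three booleans only model the
state that pte_dirty(), PageDirty()/set_page_dirty() and
mark_page_accessed() read or update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values for this userspace model only. */
#define MODEL_FOLL_WRITE 0x01u
#define MODEL_FOLL_TOUCH 0x02u

/* Hypothetical stand-in for the state the kernel snippet inspects. */
struct model_page {
	bool pte_dirty;   /* models pte_dirty(pte) */
	bool page_dirty;  /* models PageDirty(page) */
	bool accessed;    /* models the effect of mark_page_accessed() */
};

/*
 * Mirrors the FOLL_TOUCH branch quoted above:
 * - without FOLL_TOUCH nothing is updated;
 * - FOLL_TOUCH alone only marks the page accessed;
 * - FOLL_TOUCH | FOLL_WRITE also dirties the page if neither the
 *   pte nor the page is already dirty.
 */
static void model_follow_touch(struct model_page *pg, unsigned int flags)
{
	assert(pg != NULL);

	if (flags & MODEL_FOLL_TOUCH) {
		if ((flags & MODEL_FOLL_WRITE) &&
		    !pg->pte_dirty && !pg->page_dirty)
			pg->page_dirty = true;
		pg->accessed = true;
	}
}
```

So in this model FOLL_TOUCH on its own never sets the dirty state, which
matches John's reading; only the combination with FOLL_WRITE does.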

> 
> 
> > should be dirty. Currently, get_user_pages works like that.
> > However, the problem is get_user_pages_fast, since it looks like
> > lockless_pages_from_mm doesn't support FOLL_TOUCH.
> > 
> > So the idea is: if the param in internal_get_user_pages_fast
> > includes FOLL_TOUCH, gup_{pmd,pte}_range tries to make the
> > page dirty under trylock_page (if the lock fails, it goes to the
> 
> Marking a page dirty solely because FOLL_TOUCH is specified would
> be an API-level mistake. That's why it isn't "supported". Or at least,
> that's how I'm reading things.
> 
> Hope that helps!
> 
> > slow path with __gup_longterm_unlocked and set_dirty_pages
> > for them).
> > 
> > This approach would also solve other cases where userspace
> > pages are mapped into kernel space and then written. Since the
> > write doesn't go through the process's page table, we lose
> > the dirty bit in the page table of the process, and it
> > turns out to be the same problem. That's why I'd like to take
> > this approach.
> > 
> > If it doesn't work, the other option to fix this specific
> > case: can't we make pages dirty in advance in the DIO read case?
> > 
> > When I look at the DIO code, it's already doing that in the async case.
> > Couldn't we do the same thing for the other cases?
> > I guess the worst case we would see is more page
> > writeback, since pages become dirty unnecessarily.
> 
> Marking pages dirty after pinning them is a pre-existing area of
> problems. See the long-running LWN articles about get_user_pages() [1].

Oh, do you mean that marking pages dirty in the DIO path is already a known problem?
Let me read the articles in the link.
Thanks!

> 
> 
> [1] https://lwn.net/Kernel/Index/#Memory_management-get_user_pages
> 
> thanks,
> -- 
> John Hubbard
> NVIDIA
> 


