在 2020/1/14 上午12:34, Matthew Wilcox 写道: > On Mon, Jan 13, 2020 at 08:47:25PM +0800, Alex Shi wrote: >> 在 2020/1/13 下午5:55, Konstantin Khlebnikov 写道: >>>>> That's wrong. Here PageLRU must be checked again under lru_lock. >>>> Hi, Konstantin, >>>> >>>> For logical remain, we can get the lock and then release for !PageLRU. >>>> but I still can figure out the problem scenario. Would like to give more hints? >>> >>> That's trivial race: page could be isolated from lru between >>> >>> if (PageLRU(page)) >>> and >>> spin_lock_irq(&pgdat->lru_lock); >> >> yes, it could be a problem. guess the following change could helpful: >> I will update it in new version. > >> + if (lrucare) { >> + lruvec = lock_page_lruvec_irq(page); >> + if (likely(PageLRU(page))) { >> + ClearPageLRU(page); >> + del_page_from_lru_list(page, lruvec, page_lru(page)); >> + } else { >> + unlock_page_lruvec_irq(lruvec); >> + lruvec = NULL; >> + } > > What about a harder race to hit like a page being on LRU list A when you > look up the lruvec, then it's removed and added to LRU list B by the > time you get the lock? At that point, you are holding a lock on the > wrong LRU list. I think you need to check not just that the page > is still PageLRU but also still on the same LRU list. > Thanks for comments, Matthew! We will check and lock lruvec after lock_page_memcg, so if it works well, a page won't moved from one lruvec to another. Also the later debug do this check, to see if lruvec changed. If you mean lru list not lruvec, It seems there is noway to figure out the lru list from a page now. Thanks Alex