On Tue, Jul 28, 2020 at 4:20 AM Alex Shi <alex.shi@xxxxxxxxxxxxxxxxx> wrote: > > > > 在 2020/7/28 上午7:34, Alexander Duyck 写道: > >> @@ -1876,6 +1876,12 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec, > >> * list_add(&page->lru,) > >> * list_add(&page->lru,) //corrupt > >> */ > >> + new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); > >> + if (new_lruvec != lruvec) { > >> + if (lruvec) > >> + spin_unlock_irq(&lruvec->lru_lock); > >> + lruvec = lock_page_lruvec_irq(page); > >> + } > >> SetPageLRU(page); > >> > >> if (unlikely(put_page_testzero(page))) { > > I was going through the code of the entire patch set and I noticed > > these changes in move_pages_to_lru. What is the reason for adding the > > new_lruvec logic? My understanding is that we are moving the pages to > > the lruvec provided are we not?If so why do we need to add code to get > > a new lruvec? The code itself seems to stand out from the rest of the > > patch as it is introducing new code instead of replacing existing > > locking code, and it doesn't match up with the description of what > > this function is supposed to do since it changes the lruvec. > > this new_lruvec is the replacement of removed line, as following code: > >> - lruvec = mem_cgroup_page_lruvec(page, pgdat); > This recheck is for the page move the root memcg, otherwise it cause the bug: Okay, now I see where the issue is. You moved this code so now it has a different effect than it did before. You are relocking things before you needed to. Don't forget that when you came into this function you already had the lock. In addition the patch is broken as it currently stands as you aren't using similar logic in the code just above this addition if you encounter an evictable page. As a result this is really difficult to review as there are subtle bugs here. I suppose the correct fix is to get rid of this line, but it should be placed everywhere the original function was calling spin_lock_irq(). In addition I would consider changing the arguments/documentation for move_pages_to_lru. You aren't moving the pages to lruvec, so there is probably no need to pass that as an argument. Instead I would pass pgdat since that isn't going to be moving and is the only thing you actually derive based on the original lruvec.