在 2019/11/19 上午12:11, Johannes Weiner 写道: > On Sat, Nov 16, 2019 at 11:15:02AM +0800, Alex Shi wrote: >> diff --git a/mm/memcontrol.c b/mm/memcontrol.c >> index 62470325f9bc..cf274739e619 100644 >> --- a/mm/memcontrol.c >> +++ b/mm/memcontrol.c >> @@ -1246,6 +1246,42 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct pglist_data *pgd >> return lruvec; >> } >> >> +struct lruvec *lock_page_lruvec_irq(struct page *page, >> + struct pglist_data *pgdat) >> +{ >> + struct lruvec *lruvec; >> + >> +again: >> + lruvec = mem_cgroup_page_lruvec(page, pgdat); >> + spin_lock_irq(&lruvec->lru_lock); > > This isn't safe. Nothing prevents the page from being moved to a > different memcg in between these two operations, and the lruvec having > been freed by the time you try to acquire the spinlock. > > You need to use the rcu lock to dereference page->mem_cgroup and its > lruvec when coming from the page like this. Hi Johannes, Yes, you are right. Thank a lot to point out this! Could we use rcu lock to guard the lruvec till lruvec->lru_lock getten? +struct lruvec *lock_page_lruvec_irq(struct page *page, + struct pglist_data *pgdat) +{ + struct lruvec *lruvec; + +again: + rcu_read_lock(); + lruvec = mem_cgroup_page_lruvec(page, pgdat); + spin_lock_irq(&lruvec->lru_lock); + rcu_read_unlock(); + + /* lruvec may changed in commit_charge() */ + if (lruvec != mem_cgroup_page_lruvec(page, pgdat)) { + spin_unlock_irq(&lruvec->lru_lock); + goto again; + } + + return lruvec; +} > > You also need to use page_memcg_rcu() to insert the appropriate > lockless access barriers, which mem_cgroup_page_lruvec() does not do > since it's designed for use with pgdat->lru_lock. > Since the page_memcg_rcu can not know it under a spin_lock, guess the following enhance is fine: @@ -1225,7 +1224,7 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct pglist_data *pgd goto out; } - memcg = page->mem_cgroup; + memcg = READ_ONCE(page->mem_cgroup); /* Millions thanks! Alex