On Thu, Jan 20, 2011 at 10:06:54AM +0900, KAMEZAWA Hiroyuki wrote: > On Wed, 19 Jan 2011 13:03:19 +0100 > Johannes Weiner <hannes@xxxxxxxxxxx> wrote: > > > The placement of the read-side barrier is confused: the writer first > > sets pc->mem_cgroup, then PCG_USED. The read-side barrier has to be > > between testing PCG_USED and reading pc->mem_cgroup. > > > > Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx> > > --- > > mm/memcontrol.c | 27 +++++++++------------------ > > 1 files changed, 9 insertions(+), 18 deletions(-) > > > > I am a bit dumbfounded as to why this has never had any impact. I see > > two scenarios where charging can race with LRU operations: > > > > One is shmem pages on swapoff. They are on the LRU when charged as > > page cache, which could race with isolation/putback. This seems > > sufficiently rare. > > > > The other case is a swap cache page being charged while somebody else > > had it isolated. mem_cgroup_lru_del_before_commit_swapcache() would > > see the page isolated and skip it. The commit then has to race with > > putback, which could see PCG_USED but not pc->mem_cgroup, and crash > > with a NULL pointer dereference. This does sound a bit more likely. > > > > Any idea? Am I missing something? > > > > I think troubles happen only when PCG_USED bit was found but pc->mem_cgroup > is NULL. Hmm. Correct. Well, or get linked to the wrong LRU list and subsequently prevent removal of both cgroups because it's impossible to empty them. > set pc->mem_cgroup > write_barrier > set USED bit. > > read_barrier > check USED bit > access pc->mem_cgroup > > So, is there a case which only USED bit can be seen ? That's what I am not quite sure about. As said, I think it can happen when swap cache charging races with reclaim or migration. When the two loads of the used bit and the memcg pointer get reordered, it could observe a set PCG_USED and a stale/unset pc->mem_cgroup. For example: swap minor fault: vmscan: lookup_swap_cache() unlock_page() lock_page() mem_cgroup_try_charge_swapin() putback_lru_page() mem_cgroup_add_lru_list() p = pc->mem_cgroup pc->mem_cgroup = FOO smp_wmb() pc->flags |= PCG_USED f = pc->flags if (!(f & PCG_USED)) return *p /* bang */ > Anyway, your patch is right. > > Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> Thank you. Hannes -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/ Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>