Re: [patch rfc] memcg: correctly order reading PCG_USED and pc->mem_cgroup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 20, 2011 at 10:06:54AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 19 Jan 2011 13:03:19 +0100
> Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> 
> > The placement of the read-side barrier is confused: the writer first
> > sets pc->mem_cgroup, then PCG_USED.  The read-side barrier has to be
> > between testing PCG_USED and reading pc->mem_cgroup.
> > 
> > Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> > ---
> >  mm/memcontrol.c |   27 +++++++++------------------
> >  1 files changed, 9 insertions(+), 18 deletions(-)
> > 
> > I am a bit dumbfounded as to why this has never had any impact.  I see
> > two scenarios where charging can race with LRU operations:
> > 
> > One is shmem pages on swapoff.  They are on the LRU when charged as
> > page cache, which could race with isolation/putback.  This seems
> > sufficiently rare.
> > 
> > The other case is a swap cache page being charged while somebody else
> > had it isolated.  mem_cgroup_lru_del_before_commit_swapcache() would
> > see the page isolated and skip it.  The commit then has to race with
> > putback, which could see PCG_USED but not pc->mem_cgroup, and crash
> > with a NULL pointer dereference.  This does sound a bit more likely.
> > 
> > Any idea?  Am I missing something?
> > 
> 
> I think troubles happen only when PCG_USED bit was found but pc->mem_cgroup
> is NULL. Hmm.

Correct.  Well, or get linked to the wrong LRU list and subsequently
prevent removal of both cgroups because it's impossible to empty them.

>   set pc->mem_cgroup
>   write_barrier
>   set USED bit.
> 
>   read_barrier
>   check USED bit
>   access pc->mem_cgroup
> 
> So, is there a case which only USED bit can be seen ?

That's what I am not quite sure about.  As said, I think it can happen
when swap cache charging races with reclaim or migration.

When the two loads of the used bit and the memcg pointer get
reordered, it could observe a set PCG_USED and a stale/unset
pc->mem_cgroup.

For example:

swap minor fault:			vmscan:

lookup_swap_cache()
					unlock_page()
lock_page()
mem_cgroup_try_charge_swapin()
					putback_lru_page()
					mem_cgroup_add_lru_list()
					  p = pc->mem_cgroup
  pc->mem_cgroup = FOO
  smp_wmb()
  pc->flags |= PCG_USED
					  f = pc->flags
					  if (!(f & PCG_USED))
						  return
					  *p /* bang */

> Anyway, your patch is right.
> 
> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>

Thank you.

	Hannes

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]