Re: [RFC] [PATCH 2/7 v2] memcg: add memory barrier for checking account move.

Ying Han <yinghan@xxxxxxxxxx> · Tue, 24 Jan 2012 11:04:16 -0800

On Mon, Jan 23, 2012 at 1:04 AM, Michal Hocko <mhocko@xxxxxxx> wrote:
> On Fri 20-01-12 10:08:44, Ying Han wrote:
>> On Wed, Jan 18, 2012 at 6:17 PM, KAMEZAWA Hiroyuki
>> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
>> > On Wed, 18 Jan 2012 13:37:59 +0100
>> > Michal Hocko <mhocko@xxxxxxx> wrote:
>> >
>> >> On Wed 18-01-12 09:06:56, KAMEZAWA Hiroyuki wrote:
>> >> > On Tue, 17 Jan 2012 16:26:35 +0100
>> >> > Michal Hocko <mhocko@xxxxxxx> wrote:
>> >> >
>> >> > > On Fri 13-01-12 17:33:47, KAMEZAWA Hiroyuki wrote:
>> >> > > > I think this bugfix is needed before going ahead. thoughts?
>> >> > > > ==
>> >> > > > From 2cb491a41782b39aae9f6fe7255b9159ac6c1563 Mon Sep 17 00:00:00 2001
>> >> > > > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
>> >> > > > Date: Fri, 13 Jan 2012 14:27:20 +0900
>> >> > > > Subject: [PATCH 2/7] memcg: add memory barrier for checking account move.
>> >> > > >
>> >> > > > At starting move_account(), source memcg's per-cpu variable
>> >> > > > MEM_CGROUP_ON_MOVE is set. The page status update
>> >> > > > routine check it under rcu_read_lock(). But there is no memory
>> >> > > > barrier. This patch adds one.
>> >> > >
>> >> > > OK this would help to enforce that the CPU would see the current value
>> >> > > but what prevents us from the race with the value update without the
>> >> > > lock? This is as racy as it was before AFAICS.
>> >> > >
>> >> >
>> >> > Hm, do I misunderstand ?
>> >> > ==
>> >> >    update                     reference
>> >> >
>> >> >    CPU A                        CPU B
>> >> >   set value                rcu_read_lock()
>> >> >   smp_wmb()                smp_rmb()
>> >> >                            read_value
>> >> >                            rcu_read_unlock()
>> >> >   synchronize_rcu().
>> >> > ==
>> >> > I expect
>> >> > If synchronize_rcu() is called before rcu_read_lock() => move_lock_xxx will be held.
>> >> > If synchronize_rcu() is called after rcu_read_lock() => update will be delayed.
>> >>
>> >> Ahh, OK I can see it now. Readers are not that important because it is
>> >> actually the updater who is delayed until all preexisting rcu read
>> >> sections are finished.
>> >>
>> >> In that case. Why do we need both barriers? spin_unlock is a full
>> >> barrier so maybe we just need smp_rmb before we read value to make sure
>> >> that we do not get stalled value when we start rcu_read section after
>> >> synchronize_rcu?
>> >>
>> >
>> > I doubt .... If no barrier, this case happens
>> >
>> > ==
>> >        update                  reference
>> >        CPU A                   CPU B
>> >        set value
>> >        synchronize_rcu()       rcu_read_lock()
>> >                                read_value <= find old value
>> >                                rcu_read_unlock()
>> >                                do no lock
>> > ==
>>
>> Hi Kame,
>>
>> Can you help to clarify a bit more on the example above? Why
>> read_value got the old value after synchronize_rcu().
>
> AFAIU it is because rcu_read_unlock doesn't force any memory barrier
> and we synchronize only the updater (with synchronize_rcu), so nothing
> guarantees that the value set on CPUA is visible to CPUB.

Thanks, and i might have found similar comment on the
documentation/rcu/checklist.txt:
"
The various RCU read-side primitives do -not- necessarily contain
memory barriers.
"

So, the read barrier here is to make sure no reordering between the
reader and the rcu_read_lock. The same for the write barrier which
makes sure no reordering between the updater and synchronize_rcu. The
the rcu here is to synchronize between the updater and reader. If so,
why not the change like :

       for_each_online_cpu(cpu)
               per_cpu(memcg->stat->count[MEM_CGROUP_ON_MOVE], cpu) += 1;
+      smp_wmb();

Sorry, the use of per-cpu variable MEM_CGROUP_ON_MOVE does confuse me.

--Ying
>
> --
> Michal Hocko
> SUSE Labs
> SUSE LINUX s.r.o.
> Lihovarska 1060/12
> 190 00 Praha 9
> Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href