On Thu, Apr 21, 2011 at 10:00 PM, KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
On Thu, 21 Apr 2011 21:49:04 -0700
Ying Han <yinghan@xxxxxxxxxx> wrote:

Pros <-> Cons ?

> On Thu, Apr 21, 2011 at 9:36 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
>
> > On Thu, 21 Apr 2011 21:24:15 -0700
> > Ying Han <yinghan@xxxxxxxxxx> wrote:
> >
> > > This patch creates a thread pool for memcg-kswapd. All memcgs which need
> > > background reclaim are linked to a list, and memcg-kswapd picks up a memcg
> > > from the list and runs reclaim.
> > >
> > > The concern with using a per-memcg kswapd thread is the system overhead,
> > > including memory and cputime.
> > >
> > > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> > > Signed-off-by: Ying Han <yinghan@xxxxxxxxxx>
> >
> > Thank you for merging. This seems ok to me.
> >
> > Further development may improve this or replace the thread pool with some
> > other mechanism, but I think this is good enough.
> >
>
> Thank you for reviewing and Acking. At the same time, I am still wondering
> about the thread-pool model, as I posted in the cover letter :)
>
> The per-memcg-per-kswapd model
> Pros:
> 1. memory overhead per thread: with 1k cgroups the memory consumption would
> be 8k*1000 = 8M.
> 2. we see lots of threads in 'ps -elf'
>
> Cons:
> 1. the implementation is simple and straightforward.
> 2. we can easily isolate the background reclaim overhead between cgroups.
> 3. better latency from memory pressure to the actual start of reclaim
>
> The thread-pool model
> Pros:
> 1. there is no isolation between memcg background reclaim, since the memcg
> threads are shared.
> 2. visibility and debuggability are poor. I have seen many cases where some
> kswapds run crazy, and we need a straightforward way to identify which
> cgroup is causing the reclaim.
> 3. potential starvation for some memcgs: if one work item gets stuck, the
> rest of the work won't proceed.
>
> Cons:
> 1. saves some memory.
>
> In general, the per-memcg-per-kswapd implementation looks sane to me at this
> point, especially since the shared memcg thread model will make debugging
> issues very hard later.
>
> Comments?
>
My idea is to add trace points for memcg-kswapd and see what it is doing now.
(We have too few trace points in memcg...)

I don't think it's sane to create a kthread per memcg, because we know there
is a user who creates hundreds/thousands of memcgs.

And I think that creating more threads than the number of cpus, all doing the
same job, will cause much harder starvation and priority-inversion issues.
Keeping the scheduling knobs/chances for these jobs inside memcg is important.
I don't want to have to give hints to the scheduler because of a
memcg-internal issue.

And even if memcg-kswapd doesn't exist, memcg works (well?). memcg-kswapd just
helps make things better; it does not do any critical job. So it's okay to
have this as a best-effort service.

Of course, a better scheduling idea for picking up memcgs is welcome. It's
round-robin now.
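
To make the round-robin pick-up concrete, here is a toy userspace model in
Python. It is only a sketch of the idea, not the patch itself: the names
(ReclaimPool, request_reclaim, pick_next, run_worker), the per-turn batch size
and the "target" threshold are all assumptions made up for illustration.

```python
from collections import deque


class ReclaimPool:
    """Toy model of a shared memcg-kswapd work list (hypothetical names)."""

    def __init__(self):
        self.pending = deque()   # memcgs waiting for background reclaim, FIFO
        self.queued = set()      # guards against linking the same memcg twice

    def request_reclaim(self, memcg):
        # Link the memcg at the tail of the list unless it is already queued.
        if memcg not in self.queued:
            self.queued.add(memcg)
            self.pending.append(memcg)

    def pick_next(self):
        # Round-robin: always take the memcg that has waited the longest.
        if not self.pending:
            return None
        memcg = self.pending.popleft()
        self.queued.discard(memcg)
        return memcg


def run_worker(pool, usage, target, batch):
    """Reclaim `batch` units per turn; requeue a memcg still above its target."""
    order = []
    while True:
        memcg = pool.pick_next()
        if memcg is None:
            break
        usage[memcg] = max(usage[memcg] - batch, 0)
        order.append(memcg)
        if usage[memcg] > target[memcg]:
            pool.request_reclaim(memcg)   # goes to the tail, behind the others
    return order
```

Because a memcg that still needs work goes back to the tail, one busy memcg
cannot monopolize a worker for longer than one batch at a time.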
Hmm. The concern I have is debuggability. Let's say I am running a system and
find memcg-3 running crazy. Is there a way to find out which memcg it is
trying to reclaim pages from? Also, how would we account the cputime of the
shared memcg threads to the individual memcgs if we wanted to?
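
One possible shape for this, sketched as a toy userspace model in Python
rather than kernel code: each shared worker records which memcg it is
currently working for, and charges the cputime of each reclaim pass to that
memcg. Everything here (WorkerStats, reclaim, the injectable clock) is a
hypothetical name for illustration and not something the patch provides.

```python
import time
from collections import defaultdict


class WorkerStats:
    """Per-worker bookkeeping for a shared reclaim thread (hypothetical)."""

    def __init__(self, name, clock=time.thread_time):
        self.name = name                    # e.g. "memcg-3"
        self.clock = clock                  # per-thread CPU clock by default
        self.current = None                 # memcg being reclaimed right now
        self.cputime = defaultdict(float)   # CPU seconds charged per memcg

    def reclaim(self, memcg, work):
        # Publish the target first so an observer can see whom we work for.
        self.current = memcg
        start = self.clock()
        try:
            work()
        finally:
            # Charge the elapsed CPU time to the memcg, even if work() raised.
            self.cputime[memcg] += self.clock() - start
            self.current = None
```

With something like this, "which memcg is memcg-3 reclaiming from?" becomes a
lookup of `current`, and the per-memcg totals in `cputime` can be summed
across all workers in the pool.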
--Ying
Thanks,
-Kame