Re: [PATCH 2/5] Add per cgroup reclaim watermarks.

On Tue, 18 Jan 2011 13:10:39 -0800
Ying Han <yinghan@xxxxxxxxxx> wrote:

> On Tue, Jan 18, 2011 at 12:36 PM, David Rientjes <rientjes@xxxxxxxxxx> wrote:
> > On Tue, 18 Jan 2011, Ying Han wrote:
> >
> >> I agree that the "min_free_kbytes" concept doesn't apply well since
> >> there is no notion of a "reserved pool" in memcg. I borrowed it at
> >> the beginning to add a tunable for the per-memcg watermarks besides
> >> the hard_limit.
> >
> > You may want to add a small amount of memory that a memcg may allocate
> > from in oom conditions, however: memory reserves are allocated per-zone
> > and if the entire system is oom and that includes several dozen memcgs,
> > for example, they could all be contending for the same memory reserves.
> > It would be much easier to deplete all reserves since you would have
> > several tasks allowed to allocate from this pool: that's not possible
> > without memcg since the oom killer is serialized on zones and does not
> > kill a task if another oom killed task is already detected in the
> > tasklist.
> 
> So something like a per-memcg min_wmark, which also needs to be reserved upfront?
> 

I think the variable name 'min_free_kbytes' is the source of the confusion.
It's just a watermark that triggers background reclaim. It's not a reservation.
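
To illustrate the difference, here is a minimal userspace sketch. It is
not code from the patch: the struct, its field names and the low/high
watermark pair are all hypothetical. The watermark only decides when
background reclaim should be woken; a charge still succeeds whenever it
fits under the hard limit, so nothing is held back as a reserve.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-memcg state; names are illustrative, not from the patch. */
struct memcg_sketch {
	uint64_t limit_kb;      /* hard limit */
	uint64_t usage_kb;      /* currently charged memory */
	uint64_t low_wmark_kb;  /* wake background reclaim below this much free */
	uint64_t high_wmark_kb; /* background reclaim may stop at this much free */
};

/* Free space is simply the headroom left under the hard limit. */
static uint64_t memcg_free_kb(const struct memcg_sketch *m)
{
	return m->usage_kb >= m->limit_kb ? 0 : m->limit_kb - m->usage_kb;
}

/* The watermark is only a trigger for background reclaim. */
static bool memcg_should_wake_reclaim(const struct memcg_sketch *m)
{
	return memcg_free_kb(m) < m->low_wmark_kb;
}

/* Background reclaim can go back to sleep once enough headroom exists. */
static bool memcg_reclaim_done(const struct memcg_sketch *m)
{
	return memcg_free_kb(m) >= m->high_wmark_kb;
}

/*
 * A charge is refused only by the hard limit. It may push free space
 * below the watermark: that memory is not reserved for anyone.
 */
static bool memcg_try_charge(struct memcg_sketch *m, uint64_t kb)
{
	if (kb > memcg_free_kb(m))
		return false;
	m->usage_kb += kb;
	return true;
}
```

With limit_kb = 1000 and low_wmark_kb = 100, charging 950 KB succeeds and
leaves 50 KB free, which is below the watermark, so reclaim would be woken.
A further 50 KB charge still succeeds; only the hard limit refuses it.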


> > I think it would be very trivial to DoS the entire machine in this way:
> > set up a thousand memcgs with tasks that have core_state, for example, and
> > trigger them to all allocate anonymous memory up to their hard limit so
> > they oom at the same time.  The machine should livelock with all zones
> > having 0 pages free.
> >
> >> I read the
> >> patch posted from Satoru Moriya "Tunable watermarks", and introducing
> >> the per-memcg-per-watermark tunable
> >> sounds good to me. Might consider adding it to the next post.
> >>
> >
> > Those tunable watermarks were nacked for a reason: they are internal to
> > the VM and should be set to sane values by the kernel with no intervention
> > needed by userspace.  You'd need to show why a memcg would need a user to
> > tune its watermarks to trigger background reclaim and why that's not
> > possible by the kernel and how this is a special case in comparison to the
> > per-zone watermarks used by the VM.
> 
> KAMEZAWA gave an example in his earlier post: some enterprise users
> like to keep a fixed amount of free pages
> regardless of the hard_limit.
> 
> Since setting the wmarks has an impact on the reclaim behavior of each
> memcg, adding this flexibility helps on systems that want to
> treat memcgs differently based on priority.
> 

Please add some mechanism to throttle the CPU usage of kswapd-for-memcg
even when the user sets a bad value. The total number of threads/workers
across all memcgs should be throttled, too. (I think this parameter can be
a sysctl or a root cgroup parameter.)
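
As a rough sketch of the two limits I mean (userspace C, every name
here is hypothetical, and the locking a real implementation would need
is left out): one global cap on concurrently running reclaim workers,
and one per-memcg runtime budget per period so that a bad watermark
value cannot keep a worker spinning.

```c
#include <stdbool.h>
#include <stdint.h>

/* Global cap on reclaim workers; max_workers is an assumed tunable. */
struct reclaim_throttle {
	unsigned int max_workers;
	unsigned int active_workers;
};

/* Per-memcg CPU budget: at most budget_ms of reclaim work per period_ms. */
struct memcg_reclaim_budget {
	uint64_t period_ms;
	uint64_t budget_ms;
	uint64_t period_start_ms;
	uint64_t used_ms;
};

/* Returns true and takes a slot if another worker may start. */
static bool throttle_worker_start(struct reclaim_throttle *t)
{
	if (t->active_workers >= t->max_workers)
		return false;
	t->active_workers++;
	return true;
}

/* Releases a slot taken by throttle_worker_start(). */
static void throttle_worker_stop(struct reclaim_throttle *t)
{
	if (t->active_workers > 0)
		t->active_workers--;
}

/* Opens a new period when the old one has expired, then checks the budget. */
static bool budget_may_run(struct memcg_reclaim_budget *b, uint64_t now_ms)
{
	if (now_ms - b->period_start_ms >= b->period_ms) {
		b->period_start_ms = now_ms;
		b->used_ms = 0;
	}
	return b->used_ms < b->budget_ms;
}

/* Records reclaim runtime against the current period. */
static void budget_account(struct memcg_reclaim_budget *b, uint64_t ran_ms)
{
	b->used_ms += ran_ms;
}
```

A worker would call throttle_worker_start() before reclaiming, check
budget_may_run() between reclaim batches, and sleep until the next
period once the budget is used up.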

Thanks,
-Kame
