Re: [PATCH 2/5] Add per cgroup reclaim watermarks.

On Wed, 19 Jan 2011, KAMEZAWA Hiroyuki wrote:

> > so something like per-memcg min_wmark which also needs to be reserved upfront?
> > 
> 
> I think the variable name 'min_free_kbytes' is the source of confusion...
> It's just a watermark to trigger background reclaim. It's not reservation.
> 

min_free_kbytes alters the min watermark of zones, meaning it can increase 
or decrease the amount of memory that is reserved for GFP_ATOMIC 
allocations, those in irq context, etc.  Since oom killed tasks are allowed 
to allocate without regard to any watermark, it also can increase or 
decrease the amount of memory available to oom killed tasks.  In that 
case, it _is_ a reservation of memory.

The issue is that it's done per-zone: if other allocations are contending 
for the memory reserves that oom killed tasks need in order to exit and 
free their memory, then all memory may be depleted in the DoS scenario I 
described.

> > KAMEZAWA gave an example on his early post, which some enterprise user
> > like to keep fixed amount of free pages
> > regardless of the hard_limit.
> > 
> > Since setting the wmarks has impact on the reclaim behavior of each
> > memcg,  adding this flexibility helps the system where it like to
> > treat memcg differently based on the priority.
> > 
> 
> Please add some tricks to throttle the usage of cpu by kswapd-for-memcg
> even when the user sets some bad value. And the total number of threads/workers
> for all memcg should be throttled, too. (I think this parameter can be 
> sysctl or root cgroup parameter.)
> 

I think that you probably want to add a min_free_kbytes for each memcg 
(and users who choose not to pre-reserve memory for things like oom killed 
tasks in that cgroup may set it to 0) and then have all other watermarks 
based off that setting just like the VM currently does whenever the global 
min_free_kbytes changes.
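
As a rough sketch of what I mean (untested, and not taken from the patch 
series): the struct and function names below are made up for illustration, 
and the low/high ratios simply mirror what the per-zone setup derives from 
min today, i.e. low = min + min/4 and high = min + min/2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-memcg watermark set, in pages; not an existing kernel struct. */
struct memcg_wmarks {
	unsigned long min;
	unsigned long low;
	unsigned long high;
};

/*
 * Derive low and high from a per-memcg min_free_kbytes, using the same
 * ratios the global code applies per zone: low = min + min/4 and
 * high = min + min/2.  A min_free_kbytes of 0 leaves all three
 * watermarks at 0, i.e. nothing is pre-reserved for that memcg.
 */
static void memcg_setup_wmarks(struct memcg_wmarks *wm,
			       unsigned long min_free_kbytes,
			       unsigned long page_size_kb)
{
	unsigned long min_pages;

	assert(wm != NULL && page_size_kb != 0);

	min_pages = min_free_kbytes / page_size_kb;
	wm->min = min_pages;
	wm->low = min_pages + (min_pages >> 2);
	wm->high = min_pages + (min_pages >> 1);
}
```

The helper would be called again whenever the memcg's min_free_kbytes is 
written, the same way the global watermarks are recalculated.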

And I agree with your point that some cpu throttling will be needed so 
that one memcg continuously hitting its low watermark isn't harmful to 
other cgroups.  I'd suggest a global sysctl for that purpose, so that 
memcgs under continuous reclaim can't impact the performance of others 
and everyone is on the same playing field.
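
One possible shape for such a throttle, purely as an illustration: every 
name here is hypothetical (there is no such sysctl today), and the 
fixed-window time budget is just one way of doing it, not something the 
patches implement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical global tunable: ms of background reclaim allowed per window. */
static unsigned long sysctl_memcg_reclaim_budget_ms = 100;

/* Hypothetical length of one accounting window, in ms. */
#define MEMCG_RECLAIM_WINDOW_MS 1000UL

/* Hypothetical per-memcg accounting of background reclaim time. */
struct memcg_reclaim_throttle {
	unsigned long window_start_ms;
	unsigned long used_ms;
};

/*
 * Return true and charge cost_ms if this memcg may do another cost_ms of
 * background reclaim in the current window; return false if doing so
 * would exceed the global per-memcg budget.
 */
static bool memcg_reclaim_allowed(struct memcg_reclaim_throttle *t,
				  unsigned long now_ms,
				  unsigned long cost_ms)
{
	assert(t != NULL);

	/* Start a fresh window once the previous one has elapsed. */
	if (now_ms - t->window_start_ms >= MEMCG_RECLAIM_WINDOW_MS) {
		t->window_start_ms = now_ms;
		t->used_ms = 0;
	}

	if (t->used_ms + cost_ms > sysctl_memcg_reclaim_budget_ms)
		return false;

	t->used_ms += cost_ms;
	return true;
}
```

Because the budget is a single global value, every memcg gets the same 
ceiling no matter what watermarks its owner configured.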
