On Wed, 19 Jan 2011, KAMEZAWA Hiroyuki wrote:

> > so something like per-memcg min_wmark which also needs to be reserved upfront?
> >
>
> I think the variable name 'min_free_kbytes' is the source of confusion...
> It's just a watermark to trigger background reclaim. It's not reservation.
>

min_free_kbytes alters the min watermark of zones, meaning it can increase
or decrease the amount of memory that is reserved for GFP_ATOMIC
allocations, those in irq context, etc. Since oom killed tasks don't
allocate from any watermark, it also can increase or decrease the amount of
memory available to oom killed tasks. In that case, it _is_ a reservation
of memory. The issue is that it's done per-zone and if you're contending
for those memory reserves that some oom killed tasks need to exit and free
their memory, then it may deplete all memory in the DoS scenario I
described.

> > KAMEZAWA gave an example on his early post, which some enterprise user
> > like to keep fixed amount of free pages
> > regardless of the hard_limit.
> >
> > Since setting the wmarks has impact on the reclaim behavior of each
> > memcg, adding this flexibility helps the system where it like to
> > treat memcg differently based on the priority.
> >
>
> Please add some tricks to throttle the usage of cpu by kswapd-for-memcg
> even when the user sets some bad value. And the total number of threads/workers
> for all memcg should be throttled, too. (I think this parameter can be
> sysctl or root cgroup parameter.)
>

I think that you probably want to add a min_free_kbytes for each memcg (and
users who choose not to pre-reserve memory for things like oom killed tasks
in that cgroup may set it to 0) and then have all other watermarks based
off that setting just like the VM currently does whenever the global
min_free_kbytes changes. And I agree with your point that some cpu
throttling will be needed to not be harmful to other cgroups whenever one
memcg continuously hits its low watermark.
I'd suggest a global sysctl for that purpose, so that a memcg under
continuous reclaim can't impact the performance of others and everyone's
on the same playing field.
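
To make the watermark part concrete, here is a minimal sketch of deriving
the other per-memcg watermarks from a per-memcg min_free_kbytes. It is
userspace C for illustration only, not a patch. The struct, the function
names and the 4K page size are all assumptions made up for this example.
The low = min + min/4 and high = min + min/2 ratios are what the global
per-zone code uses at the time of writing, as far as I can tell; a memcg
implementation could well pick something different.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical per-memcg watermarks, in pages. Not a kernel structure. */
struct memcg_wmarks {
	uint64_t min;
	uint64_t low;
	uint64_t high;
};

/*
 * Derive low and high from min, mirroring the ratios the global
 * per-zone code applies: low = min + min/4, high = min + min/2.
 * A min_free_kbytes of 0 yields all-zero watermarks, i.e. nothing
 * is pre-reserved for this memcg.
 */
static struct memcg_wmarks memcg_calc_wmarks(uint64_t min_free_kbytes)
{
	struct memcg_wmarks w;

	/* Convert kilobytes to pages. */
	w.min = min_free_kbytes >> (SKETCH_PAGE_SHIFT - 10);
	w.low = w.min + (w.min >> 2);
	w.high = w.min + (w.min >> 1);
	return w;
}

/*
 * Hypothetical check: background reclaim for the memcg would be woken
 * once its headroom (limit - usage) drops below the low watermark.
 */
static int memcg_needs_bg_reclaim(const struct memcg_wmarks *w,
				  uint64_t limit_pages, uint64_t usage_pages)
{
	uint64_t free_pages;

	assert(w != 0);
	free_pages = usage_pages < limit_pages ? limit_pages - usage_pages : 0;
	return free_pages < w->low;
}
```

With min_free_kbytes set to 4096 for a memcg this gives min = 1024,
low = 1280 and high = 1536 pages. With it set to 0 the check above never
fires, which matches the "don't pre-reserve anything" case.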