On Tue, 18 Jan 2011, Ying Han wrote:

> I agree that "min_free_kbytes" concept doesn't apply well since there
> is no notion of "reserved pool" in memcg. I borrowed it at the
> beginning is to add a tunable to the per-memcg watermarks besides the
> hard_limit.

You may want to add a small amount of memory that a memcg may allocate
from in oom conditions. However, memory reserves are allocated per-zone,
and if the entire system is oom and that includes several dozen memcgs,
for example, they could all be contending for the same memory reserves.
It would be much easier to deplete all reserves since you would have
several tasks allowed to allocate from this pool. That's not possible
without memcg, since the oom killer is serialized on zones and does not
kill a task if another oom killed task is already detected in the
tasklist.

I think it would be trivial to DoS the entire machine in this way: set
up a thousand memcgs with tasks that have core_state, for example, and
trigger them to all allocate anonymous memory up to their hard limit so
they oom at the same time. The machine should livelock with all zones
having 0 pages free.

> I read the
> patch posted from Satoru Moriya "Tunable watermarks", and introducing
> the per-memcg-per-watermark tunable
> sounds good to me. Might consider adding it to the next post.
>

Those tunable watermarks were nacked for a reason: they are internal to
the VM and should be set to sane values by the kernel with no
intervention needed by userspace. You'd need to show why a memcg would
need a user to tune its watermarks to trigger background reclaim, why
that's not possible by the kernel, and how this is a special case in
comparison to the per-zone watermarks used by the VM.
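
To put rough numbers on the contention argument, here is a toy model in
Python. It is only a sketch and not kernel code: the function name, the
reserve size, and the pages each oom victim needs are all made-up
assumptions, and the model ignores reclaim and everything else the VM
does. It only contrasts one victim at a time dipping into a shared
reserve against one victim per memcg doing so simultaneously.

```python
def pages_left_in_reserve(reserve_pages, oom_victims, pages_per_victim,
                          serialized):
    """Toy model (hypothetical, not kernel code) of a shared reserve.

    reserve_pages    -- pages in the shared reserve (assumed value)
    oom_victims      -- number of tasks that are oom at the same time
    pages_per_victim -- pages each victim takes from the reserve (assumed)
    serialized       -- True: only one victim may use the reserve at a time
                        False: every victim uses the reserve concurrently
    """
    # With serialization, at most one victim draws on the reserve at once.
    concurrent_users = 1 if serialized else oom_victims
    demand = concurrent_users * pages_per_victim
    # The reserve cannot go below zero pages.
    return max(reserve_pages - demand, 0)


if __name__ == "__main__":
    # All numbers below are illustrative assumptions.
    reserve = 16384      # pages in the shared reserve
    memcgs = 1000        # memcgs that hit their hard limit together
    per_victim = 32      # pages each victim allocates from the reserve

    print(pages_left_in_reserve(reserve, memcgs, per_victim, serialized=True))
    print(pages_left_in_reserve(reserve, memcgs, per_victim, serialized=False))
```

With these assumed numbers the serialized case barely touches the
reserve, while the concurrent case asks for 32000 pages from a 16384-page
pool and leaves nothing.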