On 04/22/2011 11:33 PM, Ying Han wrote:
> Now we would like to launch another job C, since we know there is
> A(16G - 10G) + B(16G - 10G) = 12G of "cold" memory that can be
> reclaimed (w/o impacting A's and B's performance). So what will
> happen?
>
> 1. We start running C on the host, which triggers global memory
> pressure right away. If the reclaim is fast, C starts growing with
> the free pages from A and B. However, it is possible that reclaim
> cannot keep up with the job's page allocations, and we end up with
> either an OOM condition or a performance spike on one of the running
> jobs.
>
> One way to improve this is to set a wmark on A and/or B so they
> proactively reclaim pages before we launch C. Global memory pressure
> won't help much here, since we never trigger it. min_free_kbytes more
> or less indirectly provides the same thing at a global level, but I
> don't think anybody tunes it just to make background reclaim more
> aggressive.
This sounds like yet another reason to have a tunable that can increase the gap between min_free_kbytes and low_free_kbytes (automatically scaled to size in every zone).

The realtime people want this to reduce allocation latencies. I want it for dynamic virtual machine resizing, without the memory fragmentation inherent in balloons (which would destroy the performance benefit of transparent hugepages). Now Google wants it for job placement.

Is there any good reason we can't have a low watermark equivalent to min_free_kbytes? :)

-- 
All rights reversed