On Thu, Jan 05, 2012 at 02:06:45PM -0800, Andrew Morton wrote:
> On Thu, 5 Jan 2012 16:17:39 +0000
> Mel Gorman <mel@xxxxxxxxx> wrote:
>
> > mm: page allocator: Guard against CPUs going offline while draining per-cpu page lists
> >
> > While running a CPU hotplug stress test under memory pressure, I
> > saw cases where under enough stress the machine would halt although
> > it required a machine with 8 cores and plenty memory. I think the
> > problems may be related.
>
> When we first implemented them, the percpu pages in the page allocator
> were of really really marginal benefit. I didn't merge the patches at
> all for several cycles, and it was eventually a 49/51 decision.
>
> So I suggest that our approach to solving this particular problem
> should be to nuke the whole thing, then see if that caused any
> observable problems. If it did, can we solve those problems by means
> other than bringing the dang things back?
>

Sounds drastic. It would be less controversial to replace this patch
with a version that calls get_online_cpus() in drain_all_pages() but
removes the call to drain_all_pages() from the page allocator, on the
grounds that it is not safe against CPU hotplug, and to hell with the
slightly elevated allocation failure rates and stalls. That would avoid
the try_get_online_cpus() crappiness and be less complex.

If you really want to consider deleting the per-cpu allocator, maybe
it could be an LSF/MM topic? Personally I would be wary of deleting it,
but mostly because I lack regular access to the type of hardware needed
to evaluate whether it is safe to remove or not. Minimally, removing
the per-cpu allocator could make the zone lock very hot, even though
slub probably makes it very hot already.

-- 
Mel Gorman
SUSE Labs