On Thu, Jan 05, 2012 at 02:20:17PM +0000, Mel Gorman wrote:
> On Tue, Jan 03, 2012 at 12:45:45PM -0500, KOSAKI Motohiro wrote:
> > >  void drain_all_pages(void)
> > >  {
> > > -       on_each_cpu(drain_local_pages, NULL, 1);
> > > +       int cpu;
> > > +       struct per_cpu_pageset *pcp;
> > > +       struct zone *zone;
> > > +
> >
> > get_online_cpu() ?
> >
>
> Just a separate note;
>
> I'm looking at some mysterious CPU hotplug problems that only happen
> under heavy load. My strongest suspicion at the moment that the problem
> is related to on_each_cpu() being used without get_online_cpu() but you
> cannot simply call get_online_cpu() in this path without causing
> deadlock.

Mel,

That's a known hotplug problem.  PeterZ has a patch which (probably)
solves it, but there seems to be very little traction of any kind to
merge it.  I've been chasing that patch and getting no replies
whatsoever from folk like Peter, Thomas and Ingo.

The problem affects all IPI-raising functions, which mask with
cpu_online_mask directly.

I'm not sure that smp_call_function() can use get_online_cpu(), as it
looks like it's not permitted to sleep (it spins in csd_lock_wait if it
is to wait for the called function to complete on all CPUs, rather than
using a sleepable completion).  get_online_cpu() solves the online mask
problem by sleeping until it's safe to access it.

So, I think this whole CPU bringup mess needs to be re-thought, and the
seemingly constant tendency to pile more and more restrictions onto the
bringup path needs resolving.  It's got to the point where there are so
many restrictions that it's actually impossible for arch code to
simultaneously satisfy them all.
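
To make the sleeping-versus-spinning point above a bit more concrete,
here is a rough user-space sketch using pthreads.  It is only a model:
every name in it (hotplug_model, model_get_online() and so on) is made
up for illustration and is not kernel API, and it does not claim to
match how the kernel implements the helper referred to above as
get_online_cpu() (spelled get_online_cpus()/put_online_cpus() in the
tree).  The only thing it tries to show is that the reader side has to
be able to block while a hotplug change is in flight.

```c
#include <assert.h>
#include <pthread.h>

/* User-space model only; all names are hypothetical, not kernel API. */
struct hotplug_model {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int readers;          /* callers currently pinning the online mask */
	int writer_active;    /* non-zero while a "hotplug" change is running */
	unsigned long online; /* bit n set => CPU n is online */
};

static void model_init(struct hotplug_model *h, unsigned long mask)
{
	pthread_mutex_init(&h->lock, NULL);
	pthread_cond_init(&h->cond, NULL);
	h->readers = 0;
	h->writer_active = 0;
	h->online = mask;
}

/* Reader side: blocks (sleeps) while a writer is active. */
static void model_get_online(struct hotplug_model *h)
{
	pthread_mutex_lock(&h->lock);
	while (h->writer_active)
		pthread_cond_wait(&h->cond, &h->lock);
	h->readers++;
	pthread_mutex_unlock(&h->lock);
}

static void model_put_online(struct hotplug_model *h)
{
	pthread_mutex_lock(&h->lock);
	assert(h->readers > 0);
	if (--h->readers == 0)
		pthread_cond_broadcast(&h->cond);
	pthread_mutex_unlock(&h->lock);
}

/* Writer side: waits until no reader holds the mask, then excludes them. */
static void model_hotplug_begin(struct hotplug_model *h)
{
	pthread_mutex_lock(&h->lock);
	while (h->readers > 0 || h->writer_active)
		pthread_cond_wait(&h->cond, &h->lock);
	h->writer_active = 1;
	pthread_mutex_unlock(&h->lock);
}

static void model_hotplug_done(struct hotplug_model *h)
{
	pthread_mutex_lock(&h->lock);
	h->writer_active = 0;
	pthread_cond_broadcast(&h->cond);
	pthread_mutex_unlock(&h->lock);
}

/* Walk the mask while it is pinned, so it cannot change under us. */
static int model_count_online(struct hotplug_model *h)
{
	int cpu, n = 0;

	model_get_online(h);
	for (cpu = 0; cpu < (int)(8 * sizeof(h->online)); cpu++)
		if (h->online & (1UL << cpu))
			n++;
	model_put_online(h);
	return n;
}
```

In this model, model_get_online() can block in pthread_cond_wait() for
as long as a writer is active.  A caller that busy-waits and may run
where blocking is not allowed, which is how smp_call_function() looks
to me, cannot take the reader side without changing those rules.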