On Fri, Jan 06, 2012 at 11:36:11AM +0530, Srivatsa S. Bhat wrote:
> On 01/06/2012 03:51 AM, Mel Gorman wrote:
>
> > (Adding Greg to cc to see if he recalls seeing issues with sysfs dentry
> > suffering from recursive locking recently)
> >
> > On Thu, Jan 05, 2012 at 10:35:04AM -0800, Paul E. McKenney wrote:
> >> On Thu, Jan 05, 2012 at 04:35:29PM +0000, Russell King - ARM Linux wrote:
> >>> On Thu, Jan 05, 2012 at 04:17:39PM +0000, Mel Gorman wrote:
> >>>> Link please?
> >>>
> >>> Forwarded, as its still in my mailbox.
> >>>
> >>>> I'm including a patch below under development that is
> >>>> intended to only cope with the page allocator case under heavy memory
> >>>> pressure. Currently it does not pass testing because eventually RCU
> >>>> gets stalled with the following trace
> >>>>
> >>>> [ 1817.176001] [<ffffffff810214d7>] arch_trigger_all_cpu_backtrace+0x87/0xa0
> >>>> [ 1817.176001] [<ffffffff810c4779>] __rcu_pending+0x149/0x260
> >>>> [ 1817.176001] [<ffffffff810c48ef>] rcu_check_callbacks+0x5f/0x110
> >>>> [ 1817.176001] [<ffffffff81068d7f>] update_process_times+0x3f/0x80
> >>>> [ 1817.176001] [<ffffffff8108c4eb>] tick_sched_timer+0x5b/0xc0
> >>>> [ 1817.176001] [<ffffffff8107f28e>] __run_hrtimer+0xbe/0x1a0
> >>>> [ 1817.176001] [<ffffffff8107f581>] hrtimer_interrupt+0xc1/0x1e0
> >>>> [ 1817.176001] [<ffffffff81020ef3>] smp_apic_timer_interrupt+0x63/0xa0
> >>>> [ 1817.176001] [<ffffffff81449073>] apic_timer_interrupt+0x13/0x20
> >>>> [ 1817.176001] [<ffffffff8116c135>] vfsmount_lock_local_lock+0x25/0x30
> >>>> [ 1817.176001] [<ffffffff8115c855>] path_init+0x2d5/0x370
> >>>> [ 1817.176001] [<ffffffff8115eecd>] path_lookupat+0x2d/0x620
> >>>> [ 1817.176001] [<ffffffff8115f4ef>] do_path_lookup+0x2f/0xd0
> >>>> [ 1817.176001] [<ffffffff811602af>] user_path_at_empty+0x9f/0xd0
> >>>> [ 1817.176001] [<ffffffff81154e7b>] vfs_fstatat+0x4b/0x90
> >>>> [ 1817.176001] [<ffffffff81154f4f>] sys_newlstat+0x1f/0x50
> >>>> [ 1817.176001] [<ffffffff81448692>] system_call_fastpath+0x16/0x1b
> >>>>
> >>>> It might be a separate bug, don't know for sure.
> >>
> >
> > I rebased the patch on top of 3.2 and tested again with a bunch of
> > debugging options set (PROVE_RCU, PROVE_LOCKING etc). Same results. CPU
> > hotplug is a lot more reliable and less likely to hang but eventually
> > gets into trouble.
> >
>
> I was running some CPU hotplug stress tests recently and found it to be
> problematic too. Mel, I have some logs from those tests which appear very
> relevant to the "IPI to offline CPU" issue that has been discussed in this
> thread.
>
> Kernel: 3.2-rc7
> Here is the log:
> (Unfortunately I couldn't capture the log intact, due to some annoying
> serial console issues, but I hope this log is good enough to analyze.)
>

Ok, it looks vaguely similar to what I'm seeing. I think I spotted the
sysfs problem as well and am testing a series. I'll add you to the cc if
it passes tests locally.

Thanks.

-- 
Mel Gorman
SUSE Labs