From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
Date: Tue, 2 Sep 2008 17:42:11 -0700

> On Tue, Sep 02, 2008 at 05:16:30PM -0700, David Miller wrote:
> > So I'd like to hold off on this patch until this locking issue is
> > resolved.
>
> OK, it is your architecture.  But in the meantime, sparc64 can take
> interrupts on CPUs whose cpu_online_map bits have been cleared.

Paul, here is how I resolved this in my tree.

First, I applied a patch that killed that 'call_lock' and replaced
the accesses with ipi_call_lock() and ipi_call_unlock().

Then I sed'd up your patch so that it applies properly after that
change.

I still think there will be a problem here on sparc64.  I had the
online map clearing there happening first because the fixup_irqs()
thing doesn't drain interrupts.  It just makes sure that "device"
interrupts no longer point at the cpu.  So all new device interrupts
after fixup_irqs() will not go to the cpu.

Then we do the:

	local_irq_enable();
	mdelay(1);
	local_irq_disable();

thing to process any interrupts which were sent while we were
retargeting the device IRQs.  I also intended this to drain the
cross-call interrupts, which is why I cleared the cpu_online_map
bit before fixup_irqs() and the above "enable/disable" sequence
run.

With your change in there now, IPIs won't get drained and the system
might get stuck as a result.

I wonder if it would work if we cleared the cpu_online_map bit right
before the "enable/disable" sequence, but after fixup_irqs()?

Paul, what do you think?
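
For illustration only, the ordering asked about in the message (fixup_irqs(), then clear the cpu_online_map bit, then the enable/disable window) can be modelled in userspace C. This is a rough sketch, not the sparc64 code: struct cpu_model, the model_*() helpers and the send_*() helpers are hypothetical names invented here, and the model assumes that cross-call senders only target CPUs whose online bit is still set. It does not settle whether the ordering is safe in the real kernel.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of this is the real sparc64 code. */
struct cpu_model {
	bool online;           /* stands in for this CPU's cpu_online_map bit */
	bool irqs_routed_here; /* device IRQs still point at this CPU */
	int  pending;          /* interrupts sent but not yet handled */
	int  handled;          /* interrupts processed so far */
};

/* Assumption: a cross-call sender skips CPUs that are not marked online. */
static bool send_ipi(struct cpu_model *c)
{
	if (!c->online)
		return false;
	c->pending++;
	return true;
}

/* A device interrupt only arrives while it is still routed to this CPU. */
static bool send_device_irq(struct cpu_model *c)
{
	if (!c->irqs_routed_here)
		return false;
	c->pending++;
	return true;
}

/* Models fixup_irqs(): retargets device IRQs, drains nothing. */
static void model_fixup_irqs(struct cpu_model *c)
{
	c->irqs_routed_here = false;
}

/* Models clearing this CPU's bit in cpu_online_map. */
static void model_clear_online(struct cpu_model *c)
{
	c->online = false;
}

/* Models local_irq_enable(); mdelay(1); local_irq_disable(); */
static void model_drain_window(struct cpu_model *c)
{
	c->handled += c->pending;
	c->pending = 0;
}

/* The ordering proposed in the message. */
static void model_cpu_disable(struct cpu_model *c)
{
	model_fixup_irqs(c);   /* no new device interrupts after this */
	model_clear_online(c); /* no new cross-calls after this */
	model_drain_window(c); /* process whatever was already sent */
}
```

In this model, anything sent before model_cpu_disable() is handled in the drain window, and both send helpers refuse to queue anything afterwards. The real question in the thread, whether an IPI can still be in flight after the window closes, is outside what this sketch can show.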