Re: [PATCH] prevent sparc64 from invoking irq handlers on offline CPUs

David Miller <davem@xxxxxxxxxxxxx> · Wed, 03 Sep 2008 02:21:38 -0700 (PDT)

From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
Date: Tue, 2 Sep 2008 17:42:11 -0700

> On Tue, Sep 02, 2008 at 05:16:30PM -0700, David Miller wrote:
> > So I'd like to hold off on this patch until this locking issue is
> > resolved.
> 
> OK, it is your architecture.  But in the meantime, sparc64 can take
> interrupts on CPUs whose cpu_online_map bits have been cleared.

Paul, here is how I resolved this in my tree.

First, I applied a patch that killed that 'call_lock' and replaced
the accesses with ipi_call_lock() and ipi_call_unlock().

Then I sed'd up your patch so that it applies properly after that
change.

I still think there will be a problem here on sparc64.  I had the
online map clearing there happening first because the fixup_irqs()
thing doesn't drain interrupts.  It just makes sure that "device"
interrupts no longer point at the cpu.  So all new device interrupts
after fixup_irqs() will not go to the cpu.

Then we do the:

	local_irq_enable();
	mdelay(1);
	local_irq_disable();

thing to process any interrupts which were sent while we were
retargeting the device IRQs.

I also intended this to drain the cross-call interrupts; that's
why I cleared the cpu_online_map bit before fixup_irqs() and
the above "enable/disable" sequence run.

With your change in there now, IPIs won't get drained and the system
might get stuck as a result.

I wonder if it would work if we cleared the cpu_online_map bit right
before the "enable/disable" sequence, but after fixup_irqs()?
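
To make the three orderings easier to compare, here is a small user-space
model of them. It is only a sketch, not kernel code: struct sim_cpu,
remote_activity(), run_step() and sim_undrained() are hypothetical names,
and the model assumes that other CPUs send cross-calls only to CPUs whose
cpu_online_map bit is set, and that the patch under discussion ends up
clearing the bit after the "enable/disable" sequence (the message implies
this but does not spell it out).

```c
#include <assert.h>
#include <stdbool.h>

/* Steps a CPU performs on its way offline, as named in the message. */
enum step {
	STEP_FIXUP_IRQS,   /* fixup_irqs(): retarget device IRQs elsewhere */
	STEP_CLEAR_ONLINE, /* clear this CPU's bit in cpu_online_map */
	STEP_DRAIN         /* local_irq_enable(); mdelay(1); local_irq_disable(); */
};

/* Hypothetical model state; this is not a kernel structure. */
struct sim_cpu {
	bool online;       /* stands in for the cpu_online_map bit */
	bool dev_routed;   /* device IRQs still point at this CPU */
	bool irqs_enabled;
	int pending;       /* interrupts sent here but not yet handled */
};

/*
 * Model assumption: a remote CPU sends an IPI only to a CPU marked
 * online, and a device interrupts this CPU only while routed to it.
 */
static void remote_activity(struct sim_cpu *c)
{
	if (c->online)
		c->pending++;
	if (c->dev_routed)
		c->pending++;
}

static void run_step(struct sim_cpu *c, enum step s)
{
	switch (s) {
	case STEP_FIXUP_IRQS:
		c->dev_routed = false;
		break;
	case STEP_CLEAR_ONLINE:
		c->online = false;
		break;
	case STEP_DRAIN:
		assert(!c->irqs_enabled); /* entered with IRQs off */
		c->irqs_enabled = true;
		c->pending = 0;           /* everything pending is handled */
		c->irqs_enabled = false;
		break;
	}
}

/*
 * Run the three steps in the given order, letting remote activity
 * happen before each step and once more at the end. Returns the
 * number of interrupts left pending on the outgoing CPU.
 */
int sim_undrained(const enum step order[3])
{
	struct sim_cpu c = { .online = true, .dev_routed = true,
			     .irqs_enabled = false, .pending = 0 };
	int i;

	for (i = 0; i < 3; i++) {
		remote_activity(&c);
		run_step(&c, order[i]);
	}
	remote_activity(&c);
	return c.pending;
}
```

In this model, clearing the bit first and clearing it between fixup_irqs()
and the drain both leave nothing pending, while clearing it after the drain
leaves one IPI behind. The model says nothing about whether handlers may
run on a CPU whose bit is already clear, which is the other half of the
discussion.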

Paul, what do you think?