On 08/13, Russell King - ARM Linux wrote: > On Wed, Aug 13, 2014 at 06:57:18AM -0700, Stephen Boyd wrote: > > Commit 1a6b69b6548c (ARM: gic: add CPU migration support, > > 2012-04-12) introduced an acquisition of the irq_controller_lock > > in gic_raise_softirq() which can lead to a spinlock recursion if > > the gic_arch_extn hooks call into the scheduler (via complete() > > or wake_up(), etc.). This happens because gic_arch_extn hooks are > > normally called with the irq_controller_lock held and calling > > into the scheduler may cause us to call smp_send_reschedule() > > which will grab the irq_controller_lock again. Here's an example > > from a vendor kernel (note that the gic_arch_extn hook code here > > isn't actually in mainline): > > Here's a question: why would you want to call into the scheduler from > the gic_arch_extn code? In this case we want to send a message to another processor when an interrupt is enabled that's only a wakeup interrupt in certain low power states. It's done sort of indirectly, but basically we block that low power state from being entered so we can ensure that the interrupt wakes us up from a lighter version of suspend. > > Oh. My. God. Thomas, what have you done to the generic IRQ layer? > This is /totally/ unsafe: > > void disable_irq(unsigned int irq) > { > if (!__disable_irq_nosync(irq)) > synchronize_irq(irq); > } > > static int __disable_irq_nosync(unsigned int irq) > { > unsigned long flags; > struct irq_desc *desc = irq_get_desc_buslock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL); We got the lock here. > > if (!desc) > return -EINVAL; > __disable_irq(desc, irq, false); > irq_put_desc_busunlock(desc, flags); > return 0; > } > > void __disable_irq(struct irq_desc *desc, unsigned int irq, bool suspend) > { > if (suspend) { > if (!desc->action || (desc->action->flags & IRQF_NO_SUSPEND)) > return; > desc->istate |= IRQS_SUSPENDED; > } > > if (!desc->depth++) We're still holding the lock here right? > irq_disable(desc); > } > > You realise that disable_irq() and enable_irq() can be called by > concurrently by different drivers for the /same/ interrupt. Sure. > For > starters, that post-increment there is completely unprotected against > races. Secondly, the above is completely racy against a concurrent > enable_irq() - what if we're in disable_irq(), we've incremented > depth, but have yet to call irq_disable(). The count now has a > value of 1. > > We then preempt, and run another thread which calls enable_irq() > on it. This results in the depth being decremented, and the IRQ > is now enabled. How? Aren't we holding the descriptor lock? > > We resume the original thread, and continue to call irq_disable(), > resulting in the interrupt being disabled. > > That's not nice (the right answer is that it's strictly an unbalanced > enable_irq(), but that's no excuse here.) > -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation -- To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html