2009/10/10 David Miller <davem@xxxxxxxxxxxxx>:
> From: David Miller <davem@xxxxxxxxxxxxx>
> Date: Fri, 09 Oct 2009 15:08:29 -0700 (PDT)
>
>> Thank you for this bug report and patch, I am looking at
>> it now.
>
> I'm trying to figure out how the deadlock can even occur,
> and I've failed so far, please help me :-)
>
> See, we always take the VIO and LDC locks in the same order
> (VIO then LDC) and always with interrupts disabled, so it is
> not possible to deadlock.
>
> The only way we could deadlock is if:
>
> 1) There is some path that takes the LDC lock before the VIO one.
>
> 2) There is some path that takes either lock with interrupts
>    enabled.
>
> And I cannot find any such case.

David,

Thank you. I tried to trace the path that leads to the system hang.
This is the output log I got (the same output had been printed
18350079 times and keeps going forever):

ctl: 17, data:0 err:0, abr:0
runcc:18350079 CPUId:0, event: 0x00000004
Call Trace:
 [00000000004acb30] _handle_IRQ_event+0x50/0x120
 [00000000004acc70] handle_IRQ_event+0x70/0x120
 [00000000004af14c] handle_fasteoi_irq+0xcc/0x180
 [000000000042ee54] handler_irq+0x134/0x160
 [00000000004208b4] tl0_irq5+0x14/0x20
 [00000000004acbac] _handle_IRQ_event+0xcc/0x120
 [00000000004acc70] handle_IRQ_event+0x70/0x120
 [00000000004af14c] handle_fasteoi_irq+0xcc/0x180
 [000000000042ee54] handler_irq+0x134/0x160
 [00000000004208b4] tl0_irq5+0x14/0x20
 [00000000004acbac] _handle_IRQ_event+0xcc/0x120
 [00000000004acc70] handle_IRQ_event+0x70/0x120
 [00000000004af14c] handle_fasteoi_irq+0xcc/0x180
 [000000000042ee54] handler_irq+0x134/0x160
 [00000000004208b4] tl0_irq5+0x14/0x20
 [00000000007e7ffc] _spin_unlock_irqrestore+0x3c/0x60

> runcc:18350079 CPUId:0, event: 0x00000004

Here runcc is the number of times ldc_rx has run.

The dump code:

static irqreturn_t ldc_rx(int irq, void *dev_id)
{
	....
	/* debug counter: number of times ldc_rx has been entered */
	atomic64_inc(&runcc);
	....
	printk(KERN_INFO "runcc:%lld CPUId:%d, event: 0x%08x\n",
	       (long long)atomic64_read(&runcc), smp_processor_id(),
	       event_mask);
	dump_stack();
}

Looking at the console output, the system seems to be stuck in a
livelock: tl0_irq5 is triggered again as soon as the IRQ is re-enabled
in the irq5 handler, ldc_rx.

I have no idea about tl0_irq5. My guess is that the special
configuration causes tl0_irq5 to be triggered continuously, and the
trigger condition cannot be cleared.

Pauli He

>
> It might help if you run your test case with lockdep enabled. It will
> find such deadlocks and report them precisely to the kernel logs.
>
> Thank you!
>
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html