On 19/10/10 20:29, Tim Small wrote:
> Sprinkled a few more printks....
>
> http://buttersideup.com/files/md-raid1-lockup-lvm-snapshot/dmesg-deadlock-instrumented.txt
>

It seems that when the system is hung, conf->nr_pending gets stuck with
a value of 2. The resync task ends up stuck in the second
wait_event_lock_irq within raise_barrier, and everything else gets stuck
in the first wait_event_lock_irq while waiting for that to complete.

So my assumption is that some IOs either get stuck incomplete, or take a
path through the code such that they complete without calling
allow_barrier.

Does that make any sense?

Cheers,

Tim.

-- 
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53  http://seoss.co.uk/ +44-(0)1273-808309
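
For illustration only, the suspected failure mode can be sketched as a tiny
user-space model in C. This is not the raid1 code: the struct, every model_*
function, and the two wake-up conditions are assumptions made up for this
sketch, and it does no locking or sleeping. It only shows why a pending
counter that is never decremented could leave both the resync task and new
I/O waiting on each other.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the counters discussed above.
 * It is NOT the kernel's conf structure and does no real locking. */
struct barrier_model {
    int nr_pending; /* regular I/Os that have started but not finished */
    int barrier;    /* non-zero while resync holds the barrier raised */
};

/* Regular I/O start: counted as pending (the model assumes no barrier yet). */
static void model_io_start(struct barrier_model *m)
{
    m->nr_pending++;
}

/* Regular I/O completion. 'releases' is false for the suspected bad path
 * that finishes without an allow_barrier-style decrement. */
static void model_io_complete(struct barrier_model *m, bool releases)
{
    if (releases)
        m->nr_pending--;
}

/* Resync raises the barrier so that new regular I/O must wait. */
static void model_raise_barrier(struct barrier_model *m)
{
    m->barrier++;
}

/* Assumed wake-up condition for resync's second wait: nothing pending. */
static bool model_resync_can_run(const struct barrier_model *m)
{
    return m->nr_pending == 0;
}

/* Assumed wake-up condition for new regular I/O: barrier lowered. */
static bool model_io_can_start(const struct barrier_model *m)
{
    return m->barrier == 0;
}

/* Start total_ios, raise the barrier, then complete every I/O, with
 * leaked_ios of them skipping the decrement. Returns true if the model
 * ends up wedged: resync waits on nr_pending, new I/O waits on barrier. */
static bool model_is_wedged(int total_ios, int leaked_ios, int *stuck_pending)
{
    struct barrier_model m = { 0, 0 };
    int i;

    assert(leaked_ios >= 0 && leaked_ios <= total_ios);

    for (i = 0; i < total_ios; i++)
        model_io_start(&m);

    model_raise_barrier(&m);

    for (i = 0; i < total_ios; i++)
        model_io_complete(&m, i >= leaked_ios);

    /* Resync only lowers the barrier after it has been allowed to run. */
    if (model_resync_can_run(&m))
        m.barrier--;

    *stuck_pending = m.nr_pending;
    return !model_resync_can_run(&m) && !model_io_can_start(&m);
}
```

Calling model_is_wedged(5, 2, &n) models five I/Os of which two never
decrement the counter: n is left at 2 and the function reports a wedge,
while model_is_wedged(5, 0, &n) does not. Whether the real lockup follows
this pattern is exactly the open question in the message above.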