On Monday March 13, patrik@xxxxxxxxxxx wrote:
> Hi all,
>
> I just experienced some kind of lockup accessing my 8-drive raid5
> (2.6.16-rc4-mm2). The system has been up for 16 days running fine, but
> now processes that try to read the md device hang. ps tells me they are
> all sleeping in get_active_stripe. There is nothing in the syslog, and I
> can read from the individual drives fine with dd. mdadm says the state
> is "active".

Hmmm... That's sad. That's going to be very hard to track down.

If you could
   echo t > /proc/sysrq-trigger

and send me the dump that appears in the kernel log, I would
appreciate it. I doubt it will be very helpful, but it is the best
bet I can come up with.

>
> I'm not sure what to do now. Is it safe to try to reboot the system or
> could that cause the device to get corrupted if it's hung in the middle
> of some important operation?

You could try increasing the size of the stripe cache
   echo 512 > /sys/block/mdX/md/stripe_cache_size
(choose an appropriate 'X').

Maybe check the content of
   /sys/block/mdX/md/stripe_cache_active
as well.

Other than that, just reboot. The raid5 will do a resync, but the
data should be fine.

NeilBrown
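
The two stripe-cache steps above could be wrapped up roughly as follows. This is only a sketch: the function names and the optional sysfs-root argument are made up for illustration (the latter so the script can be tried against a fake directory tree first), and it assumes the two sysfs files behave as plain readable/writable integers. An "active" count at or near "size" would be consistent with readers waiting in get_active_stripe for a free stripe, but that is a guess, not a diagnosis.

```shell
#!/bin/sh
# Hypothetical helpers; not part of mdadm or the kernel.

# Print the current stripe cache size and the number of active stripes.
#   $1: md device name, e.g. md0
#   $2: optional sysfs root (assumption: defaults to /sys)
show_stripe_cache() {
    dir="${2:-/sys}/block/$1/md"
    printf 'size=%s active=%s\n' \
        "$(cat "$dir/stripe_cache_size")" \
        "$(cat "$dir/stripe_cache_active")"
}

# Raise stripe_cache_size to the requested value, never lower it.
#   $1: md device name
#   $2: requested size (512 is the value suggested above)
#   $3: optional sysfs root (assumption: defaults to /sys)
grow_stripe_cache() {
    dir="${3:-/sys}/block/$1/md"
    cur=$(cat "$dir/stripe_cache_size") || return 1
    if [ "$2" -gt "$cur" ]; then
        echo "$2" > "$dir/stripe_cache_size"
    fi
}
```

Usage would be something like "show_stripe_cache md0" followed by "grow_stripe_cache md0 512" as root, substituting the appropriate device name. Writing to the real sysfs file needs root, and the helper deliberately refuses to shrink the cache.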