On Friday May 26, dean@xxxxxxxxxx wrote:
> On Tue, 23 May 2006, Neil Brown wrote:
>
> i applied them against 2.6.16.18 and two days later i got my first hang...
> below is the stripe_cache foo.
>
> thanks
> -dean
>
> neemlark:~# cd /sys/block/md4/md/
> neemlark:/sys/block/md4/md# cat stripe_cache_active
> 255
> 0 preread
> bitlist=0 delaylist=255
> neemlark:/sys/block/md4/md# cat stripe_cache_active
> 255
> 0 preread
> bitlist=0 delaylist=255
> neemlark:/sys/block/md4/md# cat stripe_cache_active
> 255
> 0 preread
> bitlist=0 delaylist=255

Thanks. This narrows it down quite a bit... too much in fact: I can
now say for sure that this cannot possibly happen :-)

Two things that might be helpful:

 1/ Do you have any other patches on 2.6.16.18 other than the 3 I
    sent you? If you do I'd like to see them, just in case.

 2/ The message.gz you sent earlier with the
       echo t > /proc/sysrq-trigger
    trace in it didn't contain information about md4_raid5 - the
    controlling thread for that array. It must have been missed out
    due to a buffer overflowing. Next time it happens, could you
    get this trace again and see if you can find out what
    md4_raid5 is doing. Maybe do the 'echo t' several times.
    I think that you need a kernel recompile to make the dmesg
    buffer larger.

Thanks for your patience - this must be very frustrating for you.

NeilBrown
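
The trace collection asked for in point 2/ could be scripted roughly as follows. This is only a sketch and not something from the thread: the function names, the output file names, and the repeat count of 3 are all hypothetical, and it assumes that `dmesg` still holds the dump by the time it is read.

```shell
#!/bin/sh
# Sketch only, not from the original thread: the function names, file names,
# and repeat count are hypothetical. capture_traces must be run as root.

THREAD=${THREAD:-md4_raid5}   # controlling thread for the array

# Print the number of lines in a saved capture that mention the thread.
count_thread_lines() {
    grep -c -- "$THREAD" "$1"
}

# Trigger the sysrq 't' task dump several times, saving the kernel log
# after each one and reporting whether the thread shows up in it.
capture_traces() {
    i=1
    while [ "$i" -le 3 ]; do
        echo t > /proc/sysrq-trigger
        sleep 2
        dmesg > "sysrq-t.$i.txt"
        echo "sysrq-t.$i.txt: $(count_thread_lines "sysrq-t.$i.txt") line(s) mention $THREAD"
        i=$((i + 1))
    done
}
```

After sourcing the file, `capture_traces` would be run as root while the array is hung. A count of 0 for every capture would suggest that the md4_raid5 entry is still being lost from the buffer.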