Another update: I have been trying various combinations to see under what circumstances I can make things lock up.

The main discovery: using ext4 instead of xfs, I cannot get the server to lock up, at least not after 36 hours of continuous testing. With xfs and everything else identical, it typically locks up within 10 minutes.

This is not to say that xfs is at fault. It may be that xfs generates a higher peak load of I/O ops or something, and that tickles the problem.

In any case I see a mixture of unkillable processes: not only bonnie++ and xfsaild, but I have also seen kswapd, kworker, irqbalance, and even postfix processes (which should not even be touching the 24-disk array; there is a separate system disk directly connected to the motherboard's own SATA controller).

The test is running four concurrent bonnie++ sessions in separate screen sessions. Some of the tests performed:

- 24 SATA disks, LSI HBAs, md RAID0, XFS: rapid lockup
- 24 SATA disks, LSI HBAs, md RAID0, ext4: no lockup seen so far
- 2 SATA disks, LSI HBAs, md RAID0, XFS: no lockup
- 1 system SATA disk, motherboard SATA, no RAID, ext4: no lockup

I did also write a ruby script to do lots of concurrent dd reads (at random offsets) directly from the array. I wasn't able to replicate the problem with that.

This is with Seagate 7200rpm drives, and the total I/O bandwidth I can see is quite a lot (see iostat below). I can also replicate the problem in a similar system with Hitachi "coolspin" (5940rpm?) drives, but it seems to take somewhat longer, maybe an hour or two, so perhaps the peak I/O ops has something to do with it?

(These systems do have only 8GB RAM, so I also wondered if it was something to do with deadlocking when allocating buffer space if not enough was available.)

Regards,

Brian.
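P.S. For anyone who wants to try a similar raw-read test, here is a minimal sketch of the idea in Python. It is not the ruby script mentioned above, just an illustration of the same approach: several workers each run dd reads from the block device at random offsets. The function names, the 1 MiB block size, the read length and the worker counts are all assumptions chosen for the example, and the device size has to be supplied by the caller.

```python
import random
import subprocess
from concurrent.futures import ThreadPoolExecutor


def dd_read_command(device, device_bytes, block_size=1024 * 1024, count=64, rng=random):
    """Build a dd command that reads `count` blocks from a random block offset."""
    total_blocks = device_bytes // block_size
    if total_blocks < count:
        raise ValueError("device is smaller than a single read")
    # dd's skip= is measured in blocks of size bs, so choose a start block
    # that leaves room for `count` blocks before the end of the device.
    skip = rng.randrange(0, total_blocks - count + 1)
    return [
        "dd",
        f"if={device}",
        "of=/dev/null",
        f"bs={block_size}",
        f"skip={skip}",
        f"count={count}",
    ]


def run_readers(device, device_bytes, workers=16, reads_per_worker=100):
    """Run `workers` concurrent loops, each issuing dd reads at random offsets."""
    def worker(_index):
        for _ in range(reads_per_worker):
            subprocess.run(
                dd_read_command(device, device_bytes),
                check=True,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )

    # Threads are sufficient here: each one just waits on its dd child process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(worker, range(workers)))
```

To use it, one would call something like run_readers("/dev/md127", size_in_bytes) as root, with the size taken from `blockdev --getsize64 /dev/md127`. Note this only generates reads, so it does not exercise the write path that bonnie++ does.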
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.55    0.00   80.67   12.63    0.00    5.15

Device:       tps   kB_read/s   kB_wrtn/s   kB_read   kB_wrtn
sda          0.40        0.00        2.40         0        12
sdf        187.80    17817.60   109416.80     89088    547084
sde        182.60    17817.60   112051.20     89088    560256
sdd        183.00    17817.60   108800.00     89088    544000
sdc        167.00    17612.80   105840.80     88064    529204
sdg        162.80    17612.80   107735.20     88064    538676
sdh        180.00    18022.40   112230.40     90112    561152
sdp        168.20    17408.00   107929.60     87040    539648
sdj        179.60    17614.40   111346.40     88072    556732
sdq        174.20    17408.00   108544.00     87040    542720
sdk        201.60    17612.80   111206.40     88064    556032
sdb        189.20    17819.20   108800.00     89096    544000
sdl        195.60    17542.40   110387.20     87712    551936
sdo        196.00    17408.00   111206.40     87040    556032
sdm        200.00    17408.00   110796.80     87040    553984
sdn        189.00    17408.00   108544.00     87040    542720
sdi        168.60    18022.40   112025.60     90112    560128
sdr        192.60    17819.20   111858.40     89096    559292
sdu        193.80    17612.80   108953.60     88064    544768
sdv        202.60    17612.80   108851.20     88064    544256
sdw        178.20    17612.80   108953.60     88064    544768
sdy        191.60    17612.80   110796.80     88064    553984
sdx        196.00    17612.80   111616.00     88064    558080
sds        182.80    17612.80   109158.40     88064    545792
sdt        191.60    17203.20   111219.20     86016    556096
md127     7569.80   415064.00  2620999.20   2075320  13104996