Hi.

I'd like to revisit a problem I put to the mailing list on 27th July 2008.

My Linux system hangs if a lengthy recovery of a raid-1 device is running at the same time as any significant network traffic. If I terminate my networking applications the re-sync succeeds; if I allow them to run then the re-sync will almost always hang the system.

My PC is about 1.5 years old; it has a Gigabyte GA-P35-DS4 motherboard with an Intel Core 2 Quad Q6600 CPU. The motherboard has an Intel ICH9R southbridge with 6 SATA 2 ports and a separate 'Gigabyte' (JMicron 20360/20363) controller with 2 SATA 2 ports.

I have two 500GB Western Digital SATA 2 internal disks, both on the ICH9R controller, as I used to get occasional SATA disconnects/errors if I had a disk under heavy load on the JMicron controller. The two disks have 400GB partitions in an MD raid1 mirror (md1). I typically experience this problem when I plug in a third disk (also on the ICH9R controller) to synchronise as a backup procedure, but it also happens if I just have the two permanent disks synchronising between themselves.

I'm running Linux 2.6.28.6.

The motherboard has a Realtek RTL8111/8168B gigabit ethernet controller, which I have running on a 100Mbit full-duplex link to my ADSL modem. I'm using the kernel's standard r8169 driver for the network.

If I have no significant network activity taking place (other than trivial traffic from named, ntpd and the like) then my md1 recoveries always succeed. But if I have a program maxing out the connection to my ISP - about 160KB/sec down, 30KB/sec up - then the re-synchronisation will always end up hanging:

 o disk I/O stops - the disk activity LED stops flashing, iostat statistics drop to zero, and 'cat /proc/mdstat' shows dwindling I/O speeds and ever-increasing finish times (from 200 minutes to 30,000+ minutes!).

 o any access to the filesystem I have mounted on top of the md1 device hangs.
 o access to OTHER filesystems is fine, and anything independent of the hung filesystem works as normal.

There are absolutely no errors reported by the system - nothing logged to the console and nothing logged via syslog (the /var/log filesystem is fully operational even while the recovering one is hung).

Looking at /proc/interrupts I can see that the 'eth0' driver has an interrupt all to itself.

I haven't had a single SATA disconnect error since I moved all my disks off the JMicron controller. I can 'dd' each drive simultaneously with no errors and better than 70MB/sec throughput from each in parallel.

Does anyone know of any condition which would cause the md1 recovery process to silently hang like this?

Can I get some sort of debug/verbose log out of the raid software to work out why it's hanging?

Has anyone ever experienced this sort of problem - md recovery 'sensitivity' to network traffic? - on this motherboard?

Thanks,
Brad
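
P.S. For anyone wanting to capture the stall as it happens, the dwindling speed and growing finish time shown by 'cat /proc/mdstat' could be logged with something like the sketch below. It assumes the usual "recovery = N% (...) finish=N.Nmin speed=NK/sec" progress line; the function names, the 10-second polling interval and the 1000K/sec "stalled" threshold are all illustrative choices, not anything provided by md itself.

```python
import re
import time

# Assumed shape of the md progress line during a resync/recovery, e.g.
#   [==>......]  recovery = 12.6% (52847488/419424832) finish=200.3min speed=30500K/sec
PROGRESS_RE = re.compile(
    r"(?P<action>resync|recovery)\s*=\s*(?P<percent>[\d.]+)%"
    r".*?finish=(?P<finish>[\d.]+)min"
    r"\s+speed=(?P<speed>\d+)K/sec"
)


def parse_progress(mdstat_text):
    """Return (action, percent, finish_minutes, speed_kb_per_sec), or None
    if no resync/recovery progress line is present in the text."""
    m = PROGRESS_RE.search(mdstat_text)
    if m is None:
        return None
    return (
        m.group("action"),
        float(m.group("percent")),
        float(m.group("finish")),
        int(m.group("speed")),
    )


def is_stalled(samples, min_speed_kb=1000):
    """True if the last three samples are all slower than min_speed_kb.
    The threshold is an arbitrary illustrative value."""
    recent = samples[-3:]
    return len(recent) == 3 and all(s[3] < min_speed_kb for s in recent)


def watch(path="/proc/mdstat", interval=10):
    """Print one timestamped sample every `interval` seconds until the
    progress line disappears (i.e. the recovery has finished)."""
    samples = []
    while True:
        with open(path) as f:
            sample = parse_progress(f.read())
        if sample is None:
            break
        samples.append(sample)
        flag = "  <-- looks stalled" if is_stalled(samples) else ""
        print(time.strftime("%H:%M:%S"), sample, flag, flush=True)
        time.sleep(interval)
```

Calling watch() from a shell whose working directory and output file live on a filesystem other than the one on md1 (for example under /var/log) should keep the log going even once the recovering filesystem hangs.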