Hello.

I am having a strange issue with md RAID on the 2.6.34 kernel. Specifically, it sometimes locks up the system completely, with the following symptoms:

- any attempt to read from an array never seems to return
- no errors at all on the server console
- in one lock-up episode I had "top" running, and it displayed zero CPU load (no mdX_raidX at the top of the list sorted by CPU load)
- Alt-SysRq-B works and allows me to reboot the system

Now, regarding when this happens. I had two such lock-ups shortly after moving my root FS to RAID5. After the first one I changed the FS from XFS to Ext4 (this did not help); after the second one I disabled NCQ on all drives and the write-intent bitmap on the array. After that, it worked for maybe a week of intense reads/writes on the arrays with no more hangs.

Today, I decided to convert a three-member RAID5 into a four-member RAID6. mdadm segfaulted(!) right after the --grow command, and dmesg had an error about md being unable to overwrite the /sys/.....stripe_cache_size file. (As I understand it, this is already fixed in the latest kernel.) The array then started rebuilding as a 4-member RAID6, seemingly fine, but shortly after, the system locked up in the same manner as described above. Several attempts to do the rebuild after reboots consistently caused the same lock-ups early in the rebuild (at less than 1% done). So for now, I decided to give up and returned the array to its previous three-member RAID5 configuration, which went fine.

The configuration:

md0 is 3* 1990GB RAID5
md1 is 3* 10GB RAID5 (root FS)

The three drives are 2* WD20EADS and 1* Hitachi 2TB drive. The fourth array member I was trying to add to md0 is a RAID0 of two 1TB drives (Seagate and Hitachi). The SATA controllers are the nForce4 chipset and a PCI-E JMicron JMB363.

I am using mdadm 3.1.2 now, and am going to try the 2.6.35-rc2 kernel.

So, my question is: does anyone have an idea of what could cause this, and what would be the best way to diagnose/fix the lock-up problem?

Thanks in advance.

-- 
With respect,
Roman