Scenario: dual Opteron / 4G RAM / Ubuntu pure 64-bit SMP / OS on a separate IDE drive; 3ware 8506 8-port controller driving 8x WD2500JD disks in Chenbro hotswap cages as RAID5, configured first as reiserfs (pre-catastrophe) and then as ext3 (post-catastrophe).

I'm responsible for getting this system up (done) and reliable (not done). The short version is that it ran well for a few weeks until we discovered on a reboot that a disk had silently failed, degrading the RAID5. When I tried to repair that failure, 3ware's 3dm2 software indicated that it was rebuilding the array but failed to do so, causing the loss of the entire array. I tried to rescue the data with reiserfs's fsck but was only able to recover individual chunks. Since most of the data was huge binary files and most of it was backed up elsewhere, we decided not to attempt to rescue anything and re-formatted with ext3, supposedly because it was considered more reliable and better suited for large files.

After that, the RAID stayed up for a day or so and I loaded it down with heavy disk I/O to see what would happen. The same port / disk # failed again (though at least this time the software notified us), and it seems pretty suspicious that the same port number failed both times.

Before settling on the 3ware, I played around with the motherboard's Silicon Image 4-port SATA controller and SW RAID (via mdadm) for a while and found that after a certain amount of futzing it looked not too bad, but the amount of futzing made me a bit nervous, especially since someone else is going to have to care for it. The SW RAID was about 10-20% faster than the 3ware according to bonnie++, but I liked the idea of having the RAID look like one big SCSI disk. So I went for the 3ware.

I'll detail the complete catastrophe later (already written up in large chunks; I just have to remove some inflammatory language before posting), but my question to the group is what people think of 3ware's support.
The common opinion on 3ware seems to be that it's great that they support Linux and that the HW works fine (also my experience), but my opinion has been shaded considerably by what happens when a RAID fails, when you really DO need to recover and you need a straightforward path to do so. In short, I've found 3ware's documentation and support for recovery procedures to be hard to find (via Google, for example, and also on their website), hard to understand because of some peculiar nomenclature, and sometimes misleading due to oddities of their software.

Is this just my experience, or is this a widely held view? I realize that I'm talking to a group that seems to be heavily weighted towards SW RAID, but maybe it's just me. If anyone can compare recovery paths between the two (SW vs 3ware HW), I'd be very happy to hear the stories. Given this recent experience, I'm re-evaluating whether I should switch back to SW RAID, especially given another large catastrophe involving 3ware controllers on campus.

Have people found that the Chenbro hotswap cages are a contributing factor to RAID failure? That's what one 3ware person indicated.

--
Cheers, Harry
Harry J Mangalam - 949 856 2847 (vox; email for fax) - hjm@xxxxxxxxx
<<plain text preferred>>