Good morning Chris,

Very good report, btw.

On 08/19/2016 04:18 PM, Chris Maxwell wrote:

[trim /]

> The machine has a 3ware Hardware RAID controller which is showing sdb
> and sdc as disks. (Unit 0 and Unit 1).
> Unit 0 (sdb) is made up of
> Phy 0: WD WCAW35791262
> Phy 1: Seagate 9QJ7N744
>
> Unit 1(sdc) is made up of
> Phy 2: Seagate 9QJ7F3PJ and
> Phy 3: Seagate 9QJ7R3Y1
>
> These are then combined into mirror md0 made of sdb1 and sdc1
> This is the physical volume for LVM VG lvm-raid, which then has LV inside:
> lvmdata1 and gokcen

The models of the disks would be useful information, too.  Your dmesg indicates unit 3 is very slow to report UREs, which means it's probably a desktop drive, not a raid drive.

I don't have much hardware raid experience, but I do know that smartctl won't report properly on devices connected to hardware raid without additional options on the command line.  You will need those options to get a smartctl -x report on each of these devices.

It is unclear from your description whether the Phy0/1 pair are mirrored themselves or striped.  Same with Phy2/3.  Do you have a net four copies of your data or a net two copies of your data?

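On the smartctl point above: smartmontools reaches disks behind a 3ware card with the `-d 3ware,N` device type, pointed at the controller's own device node, not at sdb/sdc. Here is a rough sketch of collecting the four reports. The controller node `/dev/twa0` is an assumption (depending on the card family it may be `/dev/twe0` or `/dev/twl0`), and I am assuming ports 0-3 line up with Phy 0-3 in your report. It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: gather smartctl -x reports for each port behind a 3ware controller.
# ASSUMPTIONS (not from the original report):
#   - the controller node is /dev/twa0 (may be /dev/twe0 or /dev/twl0)
#   - ports 0-3 correspond to Phy 0-3
CTRL="${CTRL:-/dev/twa0}"
PORTS="${PORTS:-0 1 2 3}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

# Print the smartctl command line for one port number.
smart_cmd() {
    printf 'smartctl -x -d 3ware,%s %s\n' "$1" "$CTRL"
}

for port in $PORTS; do
    if [ "$DRY_RUN" = "1" ]; then
        smart_cmd "$port"
    else
        # Needs root; writes one report file per port.
        # smartctl's exit status is a bit mask, so it is not checked here.
        smartctl -x -d "3ware,$port" "$CTRL" > "smart-port$port.txt"
    fi
done
```

Run it once as-is to check the command lines, then again as root with DRY_RUN=0 and paste the resulting files.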
> ==========================================================================
> Figure 2: mdadm --examine of /dev/sdb1 and sdc1:
>
> # mdadm --examine /dev/sd[bc]1 >> raid.status.latest
>
> /dev/sdb1:
>           Magic : a92b4efc
>         Version : 0.90.00
>            UUID : fdd98007:78663948:0760cb1c:ce437c35
>   Creation Time : Mon Oct 18 10:54:29 2010
>      Raid Level : raid1
>   Used Dev Size : 976551040 (931.31 GiB 999.99 GB)
>      Array Size : 976551040 (931.31 GiB 999.99 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 0
>
>     Update Time : Thu Aug  4 12:11:38 2016
>           State : clean
>  Active Devices : 1
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 1
>        Checksum : d3a40069 - correct
>          Events : 858
>
>
>       Number   Major   Minor   RaidDevice State
> this     2       8       17        2      spare   /dev/sdb1
>
>    0     0       0        0        0      removed
>    1     1       8       33        1      active sync   /dev/sdc1
>    2     2       8       17        2      spare   /dev/sdb1
> /dev/sdc1:
>           Magic : a92b4efc
>         Version : 0.90.00
>            UUID : fdd98007:78663948:0760cb1c:ce437c35
>   Creation Time : Mon Oct 18 10:54:29 2010
>      Raid Level : raid1
>   Used Dev Size : 976551040 (931.31 GiB 999.99 GB)
>      Array Size : 976551040 (931.31 GiB 999.99 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 0
>
>     Update Time : Thu Aug  4 12:11:38 2016
>           State : clean
>  Active Devices : 1
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 1
>        Checksum : d3a4007d - correct
>          Events : 858
>
>
>       Number   Major   Minor   RaidDevice State
> this     1       8       33        1      active sync   /dev/sdc1
>
>    0     0       0        0        0      removed
>    1     1       8       33        1      active sync   /dev/sdc1
>    2     2       8       17        2      spare   /dev/sdb1

I suspect that your array has scattered UREs on desktop drives.  The hardware raid isn't kicking the drives out after 30 seconds like software raid would (see reference threads below), but instead allows the problem to persist.

If you have a 4-way mirror, plugging these drives directly into a mobo w/ the driver timeout work-around might be the best way to safely recover your data.

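For the timeout work-around mentioned above, the usual form on this list is: ask each drive to cap its error recovery at 7 seconds via SCT ERC, and where the drive refuses, raise the kernel's per-device command timeout to 180 seconds instead. A minimal sketch follows, for drives attached directly to the motherboard (not behind the 3ware card). The device names `sdb sdc` are placeholders, and the fallback relies on smartctl returning a failure status when scterc is unsupported. Neither setting survives a power cycle, so it has to run at every boot.

```shell
#!/bin/sh
# Sketch: timeout mismatch work-around for directly attached drives.
# ASSUMPTION: the disk names below are placeholders; set DISKS yourself.
DISKS="${DISKS:-sdb sdc}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print what would be done

fix_timeout() {
    dev="$1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "smartctl -l scterc,70,70 /dev/$dev"
        echo "fallback: echo 180 > /sys/block/$dev/device/timeout"
        return 0
    fi
    # Try to limit read/write error recovery to 7.0 seconds (units of 0.1 s).
    if ! smartctl -l scterc,70,70 "/dev/$dev" > /dev/null; then
        # Drive lacks SCT ERC: give the kernel driver 180 s before it gives up.
        echo 180 > "/sys/block/$dev/device/timeout"
    fi
}

for d in $DISKS; do
    fix_timeout "$d"
done
```

With DRY_RUN left at its default this changes nothing; set DRY_RUN=0 and run as root once the drives are on the motherboard ports.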
The 3ware card certainly isn't behaving the way I would predict, which means my advice isn't valid with it in the mix.  If a hardware raid expert pipes up with alternatives, that would be helpful.

Meanwhile, please supply the smartctl -x reports.  Just paste them in your reply w/ line wrap disabled.

Phil

Readings for timeout mismatch issues:  (whole threads if possible)

http://marc.info/?l=linux-raid&m=139050322510249&w=2
http://marc.info/?l=linux-raid&m=135863964624202&w=2
http://marc.info/?l=linux-raid&m=135811522817345&w=1
http://marc.info/?l=linux-raid&m=133761065622164&w=2
http://marc.info/?l=linux-raid&m=132477199207506
http://marc.info/?l=linux-raid&m=133665797115876&w=2
http://marc.info/?l=linux-raid&m=142487508806844&w=3
http://marc.info/?l=linux-raid&m=144535576302583&w=2