Re: MD/RAID time out writing superblock

Allan Wind wrote:
> On 2009-09-18T00:44:45, Tejun Heo wrote:
>> Hello,
>>
>> Chris Webb wrote:
>>> It's quite hard for us to do this with these machines as we have
>>> them managed by a third party in a datacentre to which we don't have
>>> physical access.  However, I could very easily get an extra 'test'
>>> machine built in there, generate a work load that consistently
>>> reproduces the problems on the six drives, and then retry with an
>>> array built from 5, 4, 3 and 2 drives successively, taking out the
>>> unused drives from the chassis, to see if reducing the load on the power
>>> supply with a smaller array helps.
>> Yeap, that also should shed some light on it.
> 
> I have a SuperMicro X8DT3-F motherboard with 2 (2 TB) WDC drives 
> of the 8 bays available in the machine.  They are on a different 
> controller LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS
> which was flashed into "Initiator Target" (IT) mode to get it running 
> under Linux.
> 
> Disabling smartmontools seems to have helped in terms of failure 
> frequency.  It is almost always the 2nd drive that is kicked out 
> of the mirror although the last time it was the primary after 
> disabling smart.  hddtemp was never running on this host.
> 
> [2256003.055451] end_request: I/O error, dev sdb, sector 3907028974
> [2256003.055674] md: super_written gets error=-5, uptodate=0
> [2256003.055677] raid1: Disk failure on sdb2, disabling device.
> [2256003.055678] raid1: Operation continuing on 1 devices.
> [2256003.437315] RAID1 conf printout:
> [2256003.437318]  --- wd:1 rd:2
> [2256003.437321]  disk 0, wo:0, o:1, dev:sda2
> [2256003.437323]  disk 1, wo:1, o:0, dev:sdb2
> [2256003.440542] RAID1 conf printout:
> [2256003.440545]  --- wd:1 rd:2
> [2256003.440548]  disk 0, wo:0, o:1, dev:sda2
> 
> [3880879.007618] end_request: I/O error, dev sda, sector 3907028974
> [3880879.007839] md: super_written gets error=-5, uptodate=0
> [3880879.007842] raid1: Disk failure on sda2, disabling device.
> [3880879.007843] raid1: Operation continuing on 1 devices.
> [3880879.028518] RAID1 conf printout:
> [3880879.028521]  --- wd:1 rd:2
> [3880879.028524]  disk 0, wo:1, o:0, dev:sda2
> [3880879.028527]  disk 1, wo:0, o:1, dev:sdb2
> [3880879.031607] RAID1 conf printout:
> [3880879.031610]  --- wd:1 rd:2
> [3880879.031613]  disk 1, wo:0, o:1, dev:sdb2
> 
> There is barely any load on this box.  Disabling NCQ did not help 
> for me. 

Can you please post the full log?
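
When going through a full log, it can help to pull out just the block-layer
I/O errors and compare devices and sectors. The following is only a rough
sketch, assuming the log lines look like the excerpts quoted above; the
function name io_errors and the sample list are made up for illustration
and are not part of any existing tool.

```python
import re

# Matches lines such as:
#   [2256003.055451] end_request: I/O error, dev sdb, sector 3907028974
# An optional "> " quote prefix is tolerated because search() is used.
IO_ERROR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+end_request: I/O error, "
    r"dev (?P<dev>\w+), sector (?P<sector>\d+)"
)


def io_errors(lines):
    """Return (timestamp, device, sector) for each end_request I/O error."""
    events = []
    for line in lines:
        m = IO_ERROR.search(line)
        if m:
            events.append((float(m["ts"]), m["dev"], int(m["sector"])))
    return events


# Hypothetical input: a few lines copied from the excerpts quoted above.
sample = [
    "> [2256003.055451] end_request: I/O error, dev sdb, sector 3907028974",
    "> [2256003.055674] md: super_written gets error=-5, uptodate=0",
    "> [3880879.007618] end_request: I/O error, dev sda, sector 3907028974",
    "> [3880879.007839] md: super_written gets error=-5, uptodate=0",
]

events = io_errors(sample)
devices = sorted({dev for _, dev, _ in events})
sectors = {sector for _, _, sector in events}

for ts, dev, sector in events:
    print(f"{ts:.6f} {dev} sector {sector}")
print("devices:", devices, "distinct sectors:", len(sectors))
```

In the two excerpts above, both drives report the error at the same sector,
which is the kind of pattern that is easier to spot once the full log is
filtered this way.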

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html