Re: RAID6 data-check took almost 2 hours, clicking sounds, system unresponsive

--- On Fri, 15/4/11, Phil Turmel <philip@xxxxxxxxxx> wrote:

> From: Phil Turmel <philip@xxxxxxxxxx>
> Subject: Re: RAID6 data-check took almost 2 hours, clicking sounds, system unresponsive
> To: "Gavin Flower" <gavinflower@xxxxxxxxx>
> Cc: "Mathias Burén" <mathias.buren@xxxxxxxxx>, neilb@xxxxxxx, linux-raid@xxxxxxxxxxxxxxx
> Date: Friday, 15 April, 2011, 10:23
> On 04/14/2011 05:12 PM, Gavin Flower
> wrote:
> 
> > 
> > Hi Phil,
> > 
> > I was under the impression that I had an adequate
> power supply, so I checked all 5 drives.  
[...]
> > 
> > Note that Power_Cycle_Count is anomalous only for
> /dev/sdc, so would this suggest cable problems?
> 
> No two drives are perfectly identical, so when the drive's
> power rail is only slightly overloaded, the least tolerant
> drive chokes as the voltage declines (we're talking tens of
> milliseconds, here).  As soon as it chokes, the extra
> load disappears, and the power supply recovers.  The
> other drives carry on.  The drive that choked resets
> (*Click*) in time for the block driver to try again, and the
> cycle repeats.
> 
> As a test, borrow another power supply and hook just that
> one drive to it.  If the problem continues, the drive
> is toast.  If the problem goes away, look for a better
> power supply.  Note:  for the Barracuda with the
> problem, the detailed spec says the 5V load spikes on
> activity, not the 12V load.  So make sure the current
> capacity of the power supply meets your needs for both 5V
> & 12V (plus your motherboard).  Also check if the
> power supply has multiple regulators for drive power, and if
> you need to re-arrange the connectors to spread the load
> evenly amongst them.
> 
> As another test, you can swap all your cables around. 
> If the problem is in the cables, the problem will follow the
> cables to the drive you moved them to.
> 
> > I am not sure what to make of the other
> discrepancies.
> > 
> > Note that sda, sdb, sdd, & sde were bought and put
> in at the same time, while sdc was only obtained and
> inserted recently.
> 
> So sdc came from a different manufacturing batch, which is
> likely to have slightly different tolerances.
> 
> HTH,
> 
> Phil

Thanks Phil,

A few days ago, I noticed that 2 of my 3 RAID arrays were down to 4 out of 5 drives: /dev/sdc had dropped out, the same drive that made clicking sounds when I ran badblocks.

A couple of days ago, my friend Mario brought over his oscilloscope and a volt meter.  The 5 volt rail was showing about 4.7 volts, whereas he said it should typically read 5.2 - 5.4 (from memory), and the trace looked shaky on the oscilloscope.  The old power supply was rated at 400 Watts.

Mario suggested that power supplies rated above 500 Watts tend to be of significantly better quality; he and others also said that power supplies lose some of their capacity to deliver maximum power as they age.  So while 400 Watts seemed nominally adequate for my system, I started looking for units rated at least 500 Watts.  I also looked at other features, such as reliability and the ability to power at least 5 SATA drives without using adapters.
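As a rough sanity check of the sort Phil suggested, you can add up per-rail drive loads against the supply's rated capacity.  A sketch in shell, with illustrative figures only (the amperages below are assumptions, not my drives' actual datasheet numbers):

```shell
# Per-drive current draw -- look these up in your drives' datasheets.
# Note Phil's point: on some drives (e.g. the Barracuda) it is the 5V
# rail, not the 12V rail, that spikes under activity.
drives=5
amps_5v=0.75     # assumed 5V peak per drive
amps_12v=2.0     # assumed 12V spin-up surge per drive

awk -v n="$drives" -v a5="$amps_5v" -v a12="$amps_12v" 'BEGIN {
    printf "5V  load: %.2f A\n", n * a5
    printf "12V load: %.2f A (spin-up)\n", n * a12
}'
```

Compare the totals (plus motherboard load) against the per-rail amperages printed on the supply's label, not just the headline wattage.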

I was in the process of checking out various power supplies when my development machine ('saturn') refused to complete the boot process due to RAID problems.

There are many power supplies that would have met my requirements, but I told Mario that I was prepared to pay a bit extra if there was real benefit, as I saw no point in being penny wise and pound foolish, as they say in England.  If the time Mario and I spent on this problem (let alone that of the others who advised me) were costed, it would come to more than double the price of the power supply, so paying a bit extra seemed a good investment.  The one Mario obtained for me was the unit in stock that met my needs without being too expensive: a Cooler Master Extreme Power Plus 700W, with reasonably robust specifications - MTBF > 100,000 hours (over 11 years), high efficiency of 80% at typical load...

Reassembling the 2 degraded RAID-6 arrays went okay; now all 3 RAID arrays are complete.
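For the archives, the reassembly was along these lines (the md device and partition names here are examples, not necessarily my exact layout; check mdadm --examine output before forcing anything):

```shell
# Re-add the dropped member to each degraded array
# (md device and partition names are hypothetical examples).
sudo mdadm /dev/md0 --re-add /dev/sdc1
sudo mdadm /dev/md1 --re-add /dev/sdc2

# Watch the rebuild and confirm all members are back:
# [5/5] [UUUUU] means every member of a 5-drive array is present.
cat /proc/mdstat
grep -q '\[5/5\]' /proc/mdstat && echo "all members present"
```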

The system has been running for over 16 hours now with no apparent problems.  I ran badblocks on all 5 disks concurrently: no clicking sounds were heard, and no errors were reported. The 'ata' errors previously seen in the system log are also absent.
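For reference, the concurrent test looked roughly like this.  Note this uses badblocks in its default read-only mode, which is safe on live array members; the destructive -w mode would wipe them.  The log file naming is just my own convention:

```shell
# Read-only surface scan of all five members in parallel.
for d in /dev/sd{a,b,c,d,e}; do
    sudo badblocks -sv "$d" > "badblocks.${d##*/}.log" 2>&1 &
done
wait

# Any ATA exceptions logged while the scan ran?
dmesg | grep -i 'ata.*exception' || echo "no ata exceptions"
```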

I very much appreciate the help provided to me by the people on this list.


Regards,
Gavin

