Re: Implementing low level timeouts within MD

On Tue, 2007-10-30 at 00:08 -0500, Alberto Alonso wrote:
> On Mon, 2007-10-29 at 13:22 -0400, Doug Ledford wrote:
> 
> > OK, these you don't get to count.  If you run raid over USB...well...you
> > get what you get.  IDE never really was a proper server interface, and
> > SATA is much better, but USB was never anything other than a means to
> > connect simple devices without having to put a card in your PC; it was
> > never intended to be a raid transport.
> 
> I still count them ;-) I guess I just would have hoped for software raid
> to really not care about the lower layers.

The job of software raid is to help protect your data.  In order to do
that, the raid needs to be run over something that *at least* provides a
minimum level of reliability itself.  The entire USB spec is written
under the assumption that a USB device can disappear at any time and the
stack must accept that (and it can, just trip on a cable some time and
watch your raid device get all pissy).  So, yes, software raid can run
over any block device, but putting it over an unreliable connection
medium is like telling a gladiator that he has to face the lion with no
sword, no shield, and his hands tied behind his back.  He might survive,
but you have so seriously handicapped him that it's all but over.

> > 
> > > * Supermicro MB with ICH5/ICH5R controller and 2 RAID5 arrays of 3 
> > >   disks each. (only one drive on one array went bad)
> > > 
> > > * VIA VT6420 built into the MB with RAID1 across 2 SATA drives.
> > > 
> > > * And the most complex is this week's server with 4 PCI/PCI-X cards.
> > >   But the one that hung the server was a 4 disk RAID5 array on a
> > >   RocketRAID1540 card.
> > 
> > And 3 SATA failures, right?  I'm assuming the Supermicro is SATA or else
> > it has more PATA ports than I've ever seen.
> > 
> > Was the RocketRAID card in hardware or software raid mode?  It sounds
> > like it could be a combination of both, something like hardware on the
> > card, and software across the different cards or something like that.
> > 
> > What kernels were these under?
> 
> 
> Yes, these 3 were all SATA. The kernels (in the same order as above) 
> are:
> 
> * 2.4.21-4.ELsmp #1 (Basically RHEL v3)

*Really* old kernel.  RHEL3 is in maintenance mode already, and that was
the GA kernel.  It was also the first RHEL release with SATA support.
So, first gen driver on first gen kernel.

> * 2.6.18-4-686 #1 SMP on a Fedora Core release 2
> * 2.6.17.13 (compiled from vanilla sources)
> 
> The RocketRAID was configured for all drives as legacy/normal and
> software RAID5 across all drives. I wasn't using hardware raid on
> the last described system when it crashed.

So, the system that died *just this week* was running 2.6.17.13?  Like I
said in my last email, the SATA stack has been evolving over the last
few years, and that kernel is quite a few revisions behind.  My basic
advice is this: if you are going to use the latest and greatest hardware
options, then either make sure you are running an up-to-date distro
kernel of some sort, or watch the kernel update announcements for fixes
related to that hardware and update your kernels/drivers as appropriate.
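
As a rough illustration of keeping an eye on kernel age, here is a
minimal Python sketch that pulls the numeric version out of a release
string like the ones listed above and compares it against a floor.  The
function names and the 2.6.18 floor are made up for the example; pick
whatever baseline matches the fixes you actually need.

```python
import re


def kernel_version(release):
    """Return the leading numeric part of a release string as a tuple of ints."""
    match = re.match(r"(\d+(?:\.\d+)*)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_older_than(release, minimum):
    """True if `release` sorts before `minimum` when compared numerically."""
    return kernel_version(release) < kernel_version(minimum)


# Hypothetical floor, chosen only for this example.
MINIMUM = "2.6.18"

if __name__ == "__main__":
    for release in ("2.4.21-4.ELsmp", "2.6.18-4-686", "2.6.17.13"):
        status = "older than" if is_older_than(release, MINIMUM) else "at or above"
        print(f"{release}: {status} {MINIMUM}")
```

On a live box you could feed it platform.release() from the standard
library instead of a hard-coded string.  Note that this only compares
upstream version numbers; distro kernels carry backported fixes that the
number alone will not show.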

-- 
Doug Ledford <dledford@xxxxxxxxxx>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband
