Re: RESOLVED: Explained by known hardware failures, or keep looking?

Thanks, all.  Just an FYI to wrap up the thread.

>>> On Mon, Jun 18, 2007 at  3:25 PM, in message <4713.1182198324@xxxxxxxxxxxxx>,
Tom Lane <tgl@xxxxxxxxxxxxx> wrote: 
> "Kevin Grittner" <Kevin.Grittner@xxxxxxxxxxxx> writes:
>> I'm suspicious that either the controller
>> didn't persist dirty pages in the June 14th failure
> 
> That's what it looks like to me --- it's hard to tell if the hardware or
> the filesystem is at fault, but one way or another some pages that were
> supposedly securely down on disk were wiped to zeroes.  You should
> probably review whether the hardware is correctly reporting write-complete.
 
The hardware tech found many problems with this box.  I may just give it
a heavy update load and pull both plugs to see if it comes up clean now.
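
For checking the data files after such a plug-pull test, a minimal sketch
of a scan for blocks that read back as all zeroes, the symptom described
above. The function name and the 8192-byte block size (the PostgreSQL
default) are assumptions for illustration, not something taken from this
box. A zeroed block is not proof of corruption on its own, since a freshly
extended relation can legitimately contain one, so treat hits as
candidates to investigate.

```python
BLOCK_SIZE = 8192  # assumption: PostgreSQL's default block size


def find_zero_blocks(path, block_size=BLOCK_SIZE):
    """Return the indexes of blocks in `path` that are entirely zero bytes."""
    zero_block = bytes(block_size)
    hits = []
    with open(path, "rb") as f:
        index = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            # A short trailing block is compared against zeroes of equal length.
            if block == zero_block[:len(block)]:
                hits.append(index)
            index += 1
    return hits
```

Run it against each relation file with the server stopped, so the files
are not changing underneath the scan.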
 
The following was done:
 
Replaced 2 failed drives
Updated controller firmware
Updated SCSI microcode
Performed YaST Online updates
Connected second power supply
 
Our newer boxes have monitoring software which alerts us before a box
gets into this bad a state.
 
-Kevin