Re: how to debug hardware lockups?

on 11-15-2008 11:59 AM Rudi Ahlers spake the following:
> On Sat, Nov 15, 2008 at 8:17 PM, nate <centos-T6AQWPvKiI1cRAk/VAjCeQ@xxxxxxxxxxxxxxxx> wrote:
>> Rudi Ahlers wrote:
>>
>>> Unfortunately, I can't leave a monitor attached to the server all the
>>> time. The server is in a shared cabinet @ a 3rd party ISP, and they
>>> lock the cabinets once we're done working with it. The last lockup was
>>> about 6 days ago, and previous one about 8 days ago. There's no
>>> consistency.
>>>
>>> How can I redirect all console output to a file instead?
>> Configure a serial console, connect the console to another
>> system, and use something like minicom to log the console to a file.
>> You can't really log to the local system in this situation, as
>> you likely won't capture the event (if you did, you would have
>> seen the error in the system logs).
>>
>> In my experience most of these kinds of problems are related
>> to bad ram.
>>
>> If you're running CentOS 4.x, configure netdump to send the kernel
>> dumps to another server; if you're using CentOS 5.x, configure
>> diskdump(?) to store the dump to local disk.
>>
>> Run memtest86 on the system for a few days. Replace the system
>> with a known working one so you can take the broken system off
>> site from the ISP for diagnostics.
>>
>> I like running cerberus http://sourceforge.net/projects/va-ctcs/
>> as a burn-in tool; if the system can survive that running for
>> a couple of days, it should be good. Running it against a hundred or
>> so systems, I don't recall it taking longer than a few hours
>> to crash the system if there was a problem.
>>
>> nate
>>
>> _______________________________________________
>> CentOS mailing list
>> CentOS@xxxxxxxxxx
>> http://lists.centos.org/mailman/listinfo/centos
>>
> 
> That machine doesn't have a serial port (why do vendors think serial
> ports are obsolete????), so is there any other way to send the logs to
> a different machine then?
> 
Does it have any out-of-band management like Dell's DRAC or HP's iLO?
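
Whichever path ends up carrying the console off the box (a serial line, as nate suggested, or a management card's console redirection), the logging host has to write each line to disk as soon as it arrives, or the last messages before the lockup are lost. Below is a minimal sketch of that logging side, assuming the console shows up on the logging host as a readable line stream. The function name log_console, the script name, and the /dev/ttyS0 and console.log paths are made up for illustration, and the line speed is assumed to be set separately (for example with stty). It is a stand-in for what minicom's capture does, not a replacement for it.

```python
import sys
import time


def log_console(src, dst, clock=time.time):
    """Copy console lines from src to dst, prefixing each with a timestamp.

    src and dst are any text file-like objects. Flushing after every line
    means the last message before a lockup reaches the log file.
    Returns the number of lines copied.
    """
    count = 0
    for line in src:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(clock()))
        dst.write(f"{stamp} {line.rstrip(chr(13) + chr(10))}\n")
        dst.flush()
        count += 1
    return count


# Hypothetical invocation: python log_console.py /dev/ttyS0 console.log
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], "r", errors="replace") as src, \
            open(sys.argv[2], "a") as dst:
        log_console(src, dst)
```

The timestamps are the useful part: with lockups 6 to 8 days apart, they let you line the final console messages up against the time the machine stopped responding.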


-- 
MailScanner is like deodorant...
You hope everybody uses it, and
you notice quickly if they don't!!!!


