Re: died again




On Mon, Nov 25, 2013 at 11:45 AM, Michael Hennebry
<hennebry@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 25 Nov 2013, Mauricio Tavares wrote:
>
>> On Mon, Nov 25, 2013 at 10:25 AM, Michael Hennebry
>> <hennebry@xxxxxxxxxxxxxxxxxxxxx> wrote:
>>> On Sun, 24 Nov 2013, John R Pierce wrote:
>>>
>>>> On 11/24/2013 9:45 PM, Michael Hennebry wrote:
>>>>> CentOS 6.4 died on me again.
>>>>
>>>> only time that has EVER happened to me, on dozens and dozens of systems,
>>>> has been when there's been a serious hardware problem.
>>>
>>> I really do not know whether to hope you are correct.
>>> On one hand a new computer would be expensive.
>>> On the other, if it's something else,
>>> my diagnostic skills are clearly not up to the task.
>>>
>>      Keep an eagle eye on dmesg and the logs. If you can, bring the
>> machine down and run memtest86 for a few hours (say, when you go to
>
> I've run the memory test that comes with the Fedora 13 install disk.
> My computer's memory got a clean bill of health.
> To me, neither dmesg nor Xorg.0.log says anything interesting.
>
>> bed or are out partying). Also, *sometimes* the messages log might say
>> something interesting. But I would start with dmesg.
>
> Thank you for the reminder.  It does.
>
> Nov 25 09:47:22 localhost abrtd: Sending an email...
> Nov 25 09:47:22 localhost abrtd: Email was sent to: root@localhost
> Nov 25 09:47:24 localhost abrtd: Duplicate: UUID
> Nov 25 09:47:24 localhost abrtd: DUP_OF_DIR: /var/spool/abrt/ccpp-2013-11-25-09:46:10-7871
> Nov 25 09:47:24 localhost abrtd: Corrupted or bad directory '/var/spool/abrt/ccpp-2013-11-25-09:46:55-8008', deleting
> Nov 25 09:47:26 localhost abrtd: Directory 'ccpp-2013-11-25-09:47:25-8243' creation detected
> Nov 25 09:47:26 localhost abrt[8445]: Saved core dump of pid 8243 (/usr/bin/kdeinit4) to /var/spool/abrt/ccpp-2013-11-25-09:47:25-8243 (78938112 bytes)

      So abrt is saving a core dump of kdeinit4 (pid 8243). Since
abrt records when other applications crash, it might be worth
investigating why kdeinit4 keeps going down.
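
      To see which programs abrt has been saving core dumps for, the
messages log can be summarized. This is only a rough sketch: the
function name count_core_dumps is made up, and it assumes the
"Saved core dump of pid N (EXE)" wording shown in the log above.

```shell
# Count saved core dumps per executable from syslog-style text on stdin.
# Assumption: lines use the "Saved core dump of pid N (EXE)" wording
# seen in the quoted log; other abrt versions may phrase it differently.
count_core_dumps() {
    sed -n 's/.*Saved core dump of pid [0-9][0-9]* (\([^)]*\)).*/\1/p' |
        sort | uniq -c | sort -rn
}
```

Something like "count_core_dumps < /var/log/messages" (as root) should
then list each crashing executable with a count, most frequent first.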

> Nov 25 09:47:52 localhost abrtd: Sending an email...
> Nov 25 09:47:52 localhost abrtd: Email was sent to: root@localhost
> Nov 25 09:47:53 localhost abrtd: Duplicate: UUID
> Nov 25 09:47:53 localhost abrtd: DUP_OF_DIR: /var/spool/abrt/ccpp-2013-11-25-09:46:10-7871
> Nov 25 09:47:53 localhost abrtd: Corrupted or bad directory '/var/spool/abrt/ccpp-2013-11-25-09:47:25-8243', deleting
> Nov 25 10:04:58 localhost ntpd[2077]: time reset +0.288044 s
>
> I ran this
> for F in /dev/sd??* ; do ( tune2fs -l $F ; echo $F ) | grep -e dev -e UUID ; done | tee /tmp/tune2fs.txt
> to check for duplicate UUIDs.  I used sort and my eyeballs to check.
> There weren't any.
> The hard drive in use is newer than the motherboard,
> but older than the video card.
> I zapped the first video card installing the new hard drive.
> The second one seemed to die on its own.
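
      The sort-and-eyeball check for duplicate filesystem UUIDs could
be automated with uniq -d. A minimal sketch, assuming ext2/3/4
partitions (tune2fs only reads those); the function names
list_fs_uuids and dup_uuids are made up for illustration. Note that
abrtd's "Duplicate: UUID" message may be about abrt's own crash
identifiers rather than filesystem UUIDs, so finding no duplicates
here would not be surprising.

```shell
# Print "DEVICE UUID" for each partition given as an argument.
# Partitions that tune2fs cannot read are silently skipped.
list_fs_uuids() {
    for dev in "$@"; do
        uuid=$(tune2fs -l "$dev" 2>/dev/null | awk '/^Filesystem UUID:/ {print $3}')
        [ -n "$uuid" ] && printf '%s %s\n' "$dev" "$uuid"
    done
    return 0
}

# Read "DEVICE UUID" lines on stdin; print any UUID seen more than once.
dup_uuids() {
    awk '{print $2}' | sort | uniq -d
}
```

Run as root, "list_fs_uuids /dev/sd??* | dup_uuids" should print
nothing when every filesystem UUID is unique.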
>
>> There are some HD tests you can run, but honestly I can't pull them
>> out of the fuzzy mist that is my head. Hardware or software RAID?
>
> No raid.
>
      OK. Anything interesting from smartctl? Have you used bonnie++
before? If you run it in a separate window or screen session and keep
an eye on dmesg while it works the disk, HD problems may show up there.
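
      For watching dmesg during a disk test, a filter along these
lines might cut down the noise. It is a sketch only: the function name
disk_errors is made up, and the pattern list is a guess at messages
that commonly accompany disk trouble, not an exhaustive set.

```shell
# Filter kernel log text on stdin for lines that often accompany
# disk or controller trouble. The patterns are assumptions, so an
# empty result does not prove the disk is healthy.
disk_errors() {
    grep -Ei 'I/O error|medium error|ata[0-9]+(\.[0-9]+)?: .*(error|failed)|sd[a-z]+.*(unrecovered|sense key)' || true
}
```

While the test runs, "dmesg | disk_errors" in another terminal shows
only the suspicious lines. For SMART itself, "smartctl -H /dev/sda"
gives the overall health verdict and "smartctl -a /dev/sda" the full
attribute dump (/dev/sda is an assumption; substitute the real device).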

> --
> Michael   hennebry@xxxxxxxxxxxxxxxxxxxxx
> "On Monday, I'm gonna have to tell my kindergarten class,
> whom I teach not to run with scissors,
> that my fiance ran me through with a broadsword."  --  Lily
> _______________________________________________
> CentOS mailing list
> CentOS@xxxxxxxxxx
> http://lists.centos.org/mailman/listinfo/centos