Be advised that running with the case open will often degrade, rather
than improve, the cooling. A good system designer will ensure that air
flows through the system with the appropriate mix of turbulent and
laminar flow. With the cover off, however, the air will often pool in
unexpected places instead of following the intended path. As an example,
some of the IBM computers around here are explicitly labeled
(paraphrasing from memory): "For proper cooling, ensure that the cover
is removed for no more than 5 minutes while the system is running."
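
One way to check whether the open case helps or hurts is to log
temperatures with the cover on and then off, and compare the last
readings before a freeze. What follows is only a minimal sketch under
stated assumptions: it assumes a kernel that exposes thermal zones as
/sys/class/thermal/thermal_zone*/temp in millidegrees Celsius, which
nobody in this thread has confirmed for this machine (on older kernels
lm_sensors would be the place to look instead). The function names are
made up for illustration.

```python
import glob
import os
import time


def parse_millidegrees(raw):
    """Convert a sysfs temperature string such as '48000\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_temperatures(pattern="/sys/class/thermal/thermal_zone*/temp"):
    """Return {path: degrees C} for each readable zone; empty dict if none match.

    The default pattern is an assumption about the running kernel.
    """
    readings = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as handle:
                readings[path] = parse_millidegrees(handle.read())
        except (OSError, ValueError):
            # Some zones refuse reads or return non-numeric text; skip them.
            continue
    return readings


def log_temperatures(logfile, samples=60, interval=5.0,
                     pattern="/sys/class/thermal/thermal_zone*/temp"):
    """Append timestamped readings to logfile, syncing after every sample."""
    with open(logfile, "a") as out:
        for _ in range(samples):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            for path, celsius in read_temperatures(pattern).items():
                out.write(f"{stamp} {path} {celsius:.1f}\n")
            # Push each sample to disk so the last lines survive a hard freeze.
            out.flush()
            os.fsync(out.fileno())
            time.sleep(interval)
```

Calling log_temperatures("temps.log") once with the cover on and once
with it off gives two logs to compare. If the pattern matches nothing,
the log simply stays empty, which is itself a sign that the sysfs
assumption does not hold on that system.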
Steve Friedman
On Thu, 17 Mar 2005, Ivan Gyurdiev wrote:
(discussion was moved to fedora-test)
On Thu, 2005-03-17 at 23:13 +0100, Thomas Hille wrote:
On Thursday, 17.03.2005, at 11:29 -0500, Ivan Gyurdiev wrote:
Crash 3: Random freeze in the middle of what I'm doing. Screen freezes
to a standstill. System responds to sysrq. Sysrq-p shows X to be active.
I can provide more info if you'd like.
This occurs with the nvidia binary driver.
I had the exact same behavior. I first thought it was some update, but I
finally traced it to the chipset getting too hot. Installing an
additional fan resolved it. And believe me, I _NEVER_ would have thought
the chipset could overheat 5 minutes after boot one time and run for
days without a reboot another time, especially since my board, a Tyan
Thunder, was said to be extremely stable. (Well, now it is: uptime 15
days, no reboot in sight.)
Okay. Well, it now seems like a hardware problem, because I just ran
memtest on it, and halfway through the test the screen blanked
(standby). I'm not sure how memtest works, but I don't think it's
supposed to do that.
So, I replaced one memory chip, and it appeared to work. Gnome-terminal
no longer crashed the system, and I could play Unreal... I was just
getting pretty excited about finding the problem, and then it crashed
again in Unreal (nvidia.ko driver, obviously) with random garbage on
screen. Replacing the other memory chip makes it crash too. So it's a
hardware problem, but not the memory, even though removing one memory
chip gets rid of my problems with gnome-terminal (why?).
It could be overheating, even though I have a ridiculous number of fans
installed, and the case was open when it last crashed. Maybe the video
card is just broken: it has come slightly out of the PCI slot and
rebooted the computer a few times before, which can't be good for the
card (it's not my fault; I can't screw the card in because of
Thermaltake's flawed design).
How to proceed...
I am back on the nv.ko driver and have turned on the RENDER extension,
since it seems to make no difference. I've removed the newer memory
chip, which I got only recently. I will see if I can get the system to
crash with the nv driver and my old memory setup. I'll also run memtest
on each memory chip separately, hoping to get it to run to completion.
I'll take my video card with me over Spring Break to see if I can get it
to crash on a Windows system.
Other recommendations?
---------------------
I have another question. This has bothered me for a very long time, but
I've never reported it to LKML, since I assumed it was a hardware
problem. I have wireless Logitech peripherals (both keyboard and mouse),
controlled by a USB receiver. After a forced reboot, they fail to work
every time. I can't even type my on-boot password at the boot screen. If
I turn off the computer, they still don't work after I turn it back on.
I need to unplug the power cord, wait 5 seconds, and plug it back in,
and then it's fixed. Any idea why that would happen? It's so annoying...