The machines are both Pentium 4s, around 6 years old, with 2 GB of memory and IDE disks. Runlevel is 5 (full GUI). No obvious errors are showing up in /var/log/messages or in dmesg.

I have three machines in another room that are the same model, upgraded the same way, that seem to be happy. Those machines are only used remotely and not generally from their consoles.

"top" says that nothing is going on although the load average is 3+. "sar" also says that nothing is going on.

Yesterday I turned off the BIOS power management (so no disks spin down, no monitors turn off, and so on). I also changed /etc/ntp.conf to more closely match the latest ".rpmnew" version from Red Hat. It references our company's internal ntp servers but otherwise it is out-of-the-box from Red Hat. I thought that maybe I had found something in doing that: the machines ran fine all the rest of the day.

Then at about 7:18pm last night both machines essentially stopped working. I had "sar" running on one machine, dumping data every 30 seconds. According to the sar output, at about 7:01pm that machine essentially stopped having any work to do. Disk activity went to near zero, the machine went to 99.99% idle, and there was network activity every once in a while and the occasional hint of I/O activity, but nothing else. The user on that machine was logged in remotely and said that at about 7:15pm the connection suddenly got so slow that they couldn't work any more. The other machine was not actively in use at the time, although the user was logged in and the screen was locked.

When I came in this morning, both machines still thought it was about 7:18pm. "date" reported the day before at 7:18pm (give or take, depending on the machine), while /usr/sbin/hwclock was correct about the actual time.
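Since the stalled "date" versus the correct hwclock is the clearest symptom so far, it may be worth logging the gap between the two alongside the sar data. What follows is only a rough sketch, not something that was run on these machines. It assumes Python 3 (which these releases do not ship by default), that the kernel exposes the hardware clock as "rtc_date" and "rtc_time" lines in /proc/driver/rtc, and that the hardware clock is kept in UTC. The function names are made up for illustration.

```python
from datetime import datetime, timezone


def parse_rtc(text):
    """Parse /proc/driver/rtc-style text into an aware datetime.

    Assumption: the hardware clock is kept in UTC. If it is kept in
    local time, the result will be off by the local UTC offset.
    """
    fields = {}
    for line in text.splitlines():
        # Split at the first colon only, so "03:29:45" stays intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    stamp = fields["rtc_date"] + " " + fields["rtc_time"]
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def clock_lag_seconds(rtc_text, system_now):
    """Seconds by which the system clock trails the hardware clock.

    A negative value means the system clock is ahead of the RTC.
    """
    return (parse_rtc(rtc_text) - system_now).total_seconds()


# Hypothetical usage on a live machine (needs a readable /proc/driver/rtc):
#   with open("/proc/driver/rtc") as f:
#       print(clock_lag_seconds(f.read(), datetime.now(timezone.utc)))
```

Logged every few minutes, a lag that starts growing steadily would pin down when the system clock actually stops advancing. The caveat is that a machine whose clock has stalled may not run the check on schedule either, so gaps in the log are themselves a data point.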
As a separate problem, last evening I discovered that NFS mount points exported from all of the RHELv4 machines can be mounted by Solaris v6, v7, v8, and v9 machines, but Solaris v10 machines, both Sparc and x86 based, are unable to mount them. There are no errors in /var/log/messages. HP, SGI, and AIX machines can mount those points, but not Solaris 10.

I may have "fixed" the RHELv5 version of the problem. I had noticed that netstat was reporting around 2200 TIME_WAIT sockets, nearly all for the NIS or DNS servers. I find that setting the sysctl tcp_tw_reuse flag to 1 (default is 0 on RHELv3, RHELv4, and RHELv5) drops the number of sockets in TIME_WAIT to what I see on other machines *and* the RHELv5 machine no longer develops its version of the slowdown problem.

If I can ever get one of the problem RHELv4 machines to run a netstat while the slowdown is in effect, I'll have to see if something similar helps there. It is hard to get their "attention" when the slowdown is going on. It can be done, but have coffee handy.

Gary

> ------------------------------------------------------------
> From: "Marti, Robert" <RJM002@xxxxxxxx>
>
> What kind of disks are you using? I'd tend to look at IO issues with that kind of description.
> ------------------------------------------------------------
> From: m.roth@xxxxxxxxx
>
> What runlevel? Any clues in /var/log/messages? Or dmesg?
> ------------------------------------------------------------
> From: "Mr. Paul M. Whitney" <paul.whitney@xxxxxx>
>
> Do you have any other software installed? anti-virus? other third-party
> software? It could be a memory leak.
> ------------------------------------------------------------
> From: Kenneth Kirchner <ken@xxxxxxxxxxxxx>
>
> I would recommend installing some kind of performance monitoring software
> like Nagios or just SNMPd and Cacti.
> These will track the performance of
> your machine and let you see memory, cpu, disk I/O, processes, etc over time
> to help identify what is going on. These aren't system intensive and give
> you much better visibility when problems like this do occur. There are many
> benefits.
> ------------------------------------------------------------
> From: James Jones <jrjones@xxxxxxxxxx>
>
> I agree with Ken, you need to figure out what is eating your lunch, as they
> say. You have several programs such as top, and system monitor that can
> give you some high level look into what is going on. Also, how much memory
> is on the machines and what type of cpus are installed.
>
> If you can provide some info either from top or system monitor that may help
> in providing some additional assistance also.
> ------------------------------------------------------------
> From: "Geofrey Rainey" <Geofrey.Rainey@xxxxxxxxxx>
>
> I don't know if anyone has said this yet, but have you installed the
> sysstat package and used the "sar" utility? Perhaps you've got I/O
> issues which sar will reveal.
> ------------------------------------------------------------
> [mailto:redhat-list-bounces@xxxxxxxxxx] On Behalf Of Kenneth Kirchner
>
> I would recommend installing some kind of performance monitoring
> software like Nagios or just SNMPd and Cacti. These will track the
> performance of your machine and let you see memory, cpu, disk I/O,
> processes, etc over time to help identify what is going on. These aren't
> system intensive and give you much better visibility when problems like
> this do occur. There are many benefits.
> ------------------------------------------------------------
> On Oct 4, 2010, at 10:58 AM, Gary E Barnes wrote:
>
> > The past week I upgraded our RHELv3 machines to v4. Previously we had
> > several v3's, one v4, and one v5. The v5 has never worked. Now the new
> > v4's are acting up.
> >
> > Boot the machine, things are fine.
> > Wait overnight and the machine may
> > take ten minutes to unlock the screen, may take several 10's of seconds to
> > do an ls, and generally simply isn't usable.
> >
> > The v4's if you reboot them seem to be fine for the day.
> > The v5 if you reboot it is fine for maybe 15 minutes.
> >
> > The v4's, there will be a load average of 3 to 4, but top says nothing
> > whatsoever (other than top and the xterm) is running.
> > The v5, there will be a load average of 0.1 or less and top again says
> > nothing is running.
> >
> > SELinux is turned off. Firewall is turned off. I've even tried turning
> > off every service that isn't vital to being able to simply boot the
> > machines.
> >
> > Any ideas?
> >
> > Gary

--
redhat-list mailing list
unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
https://www.redhat.com/mailman/listinfo/redhat-list