On Wed, 2011-04-13 at 13:06 -0700, Florin Andrei wrote:
> Running v5 64bit on a Dell 1950.
>
> A cluster of 3 DB machines, identical hardware. One of them suddenly
> became slower 2 weeks ago.
>
> tar -zxf with a large file on this machine takes 1.5 minutes, but takes
> only 10 seconds on any of its siblings. CPU usage seems high while
> untarring, with lots of user and sys cycles being used, but almost no
> wait cycles. It doesn't matter whether I untar on a local disk, or on a
> fiber channel SAN volume, it's slow anyway.
>
> scp a file over the network is slow too: 6 MB/s to this machine, 70 MB/s
> to its siblings.
>
> However, this is just as fast on all systems, including the "sick" one:
>
> # time dd if=/dev/zero of=/dev/null bs=1M count=100000
> 100000+0 records in
> 100000+0 records out
> 104857600000 bytes (105 GB) copied, 2.59213 seconds, 40.5 GB/s
>
> real 0m2.600s
> user 0m0.025s
> sys 0m2.550s
>
> /proc/cpuinfo looks fine. Nothing suspect in dmesg.
>
> Reboot doesn't fix it. Power off / power on doesn't fix it. Single mode
> is slow too, and I tried a couple different kernels.
>
> Dell's online diagnostics program could find nothing wrong with it.
>
> /var/log/messages was full of "ntpd[7313]: frequency error -1707 PPM
> exceeds tolerance 500 PPM" messages. There was a lot of messages about
> "the system limit for the maximum number of semaphore sets has been
> exceeded"; there was indeed a lot of leftover semaphores created by NRPE
> (owned by the nagios user); I deleted them but nothing has changed, so
> they were a symptom, not the cause.

Are the system times different between the siblings?
Are all 3 siblings running ntpd and using the same time source (server(s))?
Do the symptoms change with ntpd stopped/running?
Are the frequency offsets the same on each sibling?

Since your log messages appear to be ntp related, you might try
resetting your frequency offset and drift values. Having a -1707 PPM
offset could cause many issues like you describe.
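
For a sense of scale, here is a rough sketch that converts a frequency
error in PPM into seconds of clock drift per day. The function name
ppm_to_sec_per_day is made up for illustration; it only does the
arithmetic (PPM x 86400 / 1,000,000) and does not query ntpd.

```shell
#!/bin/sh
# Hypothetical helper (not part of ntp): convert a frequency error
# in parts per million to seconds gained or lost per day.
ppm_to_sec_per_day() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v ppm="$1" 'BEGIN { printf "%.2f\n", ppm * 86400 / 1000000 }'
}

# The value from the log message quoted above.
ppm_to_sec_per_day -1707   # prints -147.48
```

By that arithmetic, -1707 PPM is about two and a half minutes per day,
while the 500 PPM tolerance in the log message is about 43 seconds per
day. Feeding it the value stored in /var/lib/ntp/drift on each sibling
is one way to compare them.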

service ntpd stop
ntptime -f 0
echo "0" > /var/lib/ntp/drift
service ntpd start

> I'm still kind of hoping it's a software issue, but chances are slim.
> OTOH, I can't imagine any hardware problem that would exhibit these
> symptoms.
>
> Any idea what to test?
> _______________________________________________

CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos