Matthew Tice wrote:
Currently we're migrating our static node cluster from 32-bit OpenSuse 10.3 using the disk_cache module on a 2G tmpfs to 64-bit CentOS 5.3 using the disk_cache module on a 9G tmpfs. After pushing these CentOS nodes into production (and consequently sending them many more requests) we started seeing a load spike on these systems. Preliminary tests have shown that using a 2G (maybe 3G - still testing that one) tmpfs on the same CentOS node doesn't produce the same high load. I'm not sure if this is a bug with tmpfs, Apache/disk_cache, CentOS, or what. Any insight into this strange problem would be appreciated.
I had a similar problem on my server, where the system service "mlocate" was scheduled to run every day. It basically scans every file on the system, and with the huge number of files generated by disk_cache, one scan took more than a day to finish. So the next day there were two mlocate instances running. Then three. Eventually no legitimate IO requests were being serviced and the whole server ground to a halt. The load average skyrocketed because of all the processes waiting on IO. "mlocate" didn't show up in 'top' because it used almost no CPU time. I diagnosed the problem with 'iotop' - it gives per-process IO stats.
This is probably not the same problem you're having, but iotop is still a useful tool to identify IO competition when you can't find the culprit based on CPU-time.
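If iotop isn't installed, a rough equivalent can be put together from /proc. The following is only a minimal sketch, assuming a Linux kernel with per-task IO accounting enabled (so that /proc/<pid>/io exists) and enough privileges to read other users' processes. The function names are made up for illustration; this is not how iotop itself is implemented. Note also that read_bytes/write_bytes count traffic to the storage layer, so reads served from the page cache or from tmpfs may not show up.

```python
import os
import time


def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into a dict of integer counters."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


def snapshot():
    """Return {pid: (name, read_bytes, write_bytes)} for every readable process."""
    result = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/io") as f:
                io = parse_proc_io(f.read())
            with open(f"/proc/{entry}/comm") as f:
                name = f.read().strip()
        except OSError:
            continue  # process exited, or we lack permission to read it
        result[int(entry)] = (name, io.get("read_bytes", 0), io.get("write_bytes", 0))
    return result


def io_deltas(before, after):
    """Return (bytes_moved, pid, name) rows for processes in both snapshots, busiest first."""
    rows = []
    for pid, (name, r1, w1) in after.items():
        if pid in before:
            _, r0, w0 = before[pid]
            rows.append(((r1 - r0) + (w1 - w0), pid, name))
    return sorted(rows, reverse=True)


def report(interval=5, limit=10):
    """Print the processes that moved the most bytes during `interval` seconds."""
    first = snapshot()
    time.sleep(interval)
    for total, pid, name in io_deltas(first, snapshot())[:limit]:
        print(f"{total:>12} bytes  pid {pid:<7} {name}")
```

Calling report() as root prints the ten processes that moved the most bytes to or from disk over five seconds, which is usually enough to spot something that never appears near the top of 'top'.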
Cheers,
Nicholas Sherlock
---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project. See <URL:http://httpd.apache.org/userslist.html> for more info.