I've been struggling with a webserver which crashes every few days with the dreaded "out of memory" problem. I've been trying to correlate it to spikes in traffic, but that doesn't seem to be the case. I was quite excited last time when the crash coincided with a visit by several search engines, including being hammered by the Baiduspider, but the most recent crash was in a relatively quiet traffic period, so I'm once again scratching my head. I thought, therefore, that I'd consult the experts ...

The thing which I'm most confused about is that the server isn't particularly busy. The server seems to die when we get a burst of over 400 page requests in an hour, which doesn't seem like it should really be taxing a server of these specs:

SPECS
==========

First the hardware spec. It's a real (as opposed to virtual) server with 2GB of RAM and 4GB of swap. It's got an Intel(R) Pentium(R) Dual CPU E2160 @ 1.80GHz, and a Western Digital 160GB ATA disk.

As for software, here's a quick summary. Running on CentOS 5.2:

- Apache/2.2.3 using prefork
- PHP 5.1.6 (cli)
- mysql Ver 14.12 Distrib 5.0.45
- Joomla 1.5.7 (latest version)
- WordPress 2.x (latest version)

CONFIG SETTINGS
================

The relevant sections from my Apache config.

KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5

<IfModule prefork.c>
StartServers          2
MinSpareServers       2
MaxSpareServers       4
ServerLimit         256
MaxClients          256
MaxRequestsPerChild 4000
</IfModule>

In php.ini I have the following memory-related settings.

safe_mode = Off
max_execution_time = 30
max_input_time = 60
memory_limit = 100M
log_errors = on
report_memleaks = On
error_log = /var/log/php_error.log
post_max_size = 8M

DIAGNOSTICS
===============

I wrote a script to capture various bits of memory information about the server and set it to go off once every 15 minutes.
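A capture script of this kind could look roughly like the Python sketch below. It is an illustration only, not the actual script: the command list is inferred from the section titles in the reports that follow, `vmstat 1 5` is an assumption based on the five sample rows, and the names `COMMANDS`, `banner`, `run` and `snapshot` are hypothetical.

```python
import subprocess

# Hypothetical command list, inferred from the section titles in the reports.
# The real script's exact commands are not shown in this post.
COMMANDS = [
    ("uptime", ["uptime"]),
    ("free -m", ["free", "-m"]),
    ("vmstat 1 5", ["vmstat", "1", "5"]),  # assumption: 5 samples, 1 s apart
    ("ps top 20 Processes by CPU",
     ["ps", "-eo", "user,%mem,%cpu,pid,cmd", "--sort", "-%cpu"]),
]


def banner(title, width=64):
    """Centre a title in a row of '=' characters, like the report headers."""
    return f" {title} ".center(width, "=")


def run(argv):
    """Run one command and return its stdout, or a short note if it fails."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=60)
        return done.stdout
    except (OSError, subprocess.SubprocessError) as exc:
        return f"(could not run {argv[0]}: {exc})\n"


def snapshot(commands=COMMANDS, runner=run, top=20):
    """Build one report: a banner followed by the output of each command."""
    parts = []
    for title, argv in commands:
        lines = runner(argv).splitlines()
        if argv[0] == "ps":
            lines = lines[: top + 1]  # header row plus the top N processes
        parts.append(banner(title))
        parts.extend(lines)
    return "\n".join(parts) + "\n"
```

Appending the result of `snapshot()` to a log file from a cron entry every 15 minutes would give reports of the same general shape as the ones below.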
Of course, once the server freezes up, these cron jobs run less often, but at the last crash I managed to get one report just before and one just after the first Out Of Memory message. Let me know if you're interested in a copy of the script. There are no relevant error messages in the PHP error log or MySQL error log.

Before crash.
=============

=========================== SUMMARY ============================
Tue Dec 23 19:15:01 HKT 2008
=========================== uptime ==============================
 19:15:01 up 3 days, 11:34,  0 users,  load average: 5.11, 2.31, 1.00
========================== free -m ==============================
             total       used       free     shared    buffers     cached
Mem:          2001        879       1121          0         19        351
-/+ buffers/cache:        508       1493
Swap:         4094          0       4094
========================= vmstat 1.5 ============================
 procs -----------memory------------ ---swap-- -----io---- -system-- ------cpu-------
 r   b    swpd    free   buff  cache   si   so    bi    bo   in   cs  us sy  id wa st
 3   1     124 1148504  19920 360436    0    0     9    30   13   35  10  1  89  0  0
 0   0     124 1224564  19924 360516    0    0     0     4 1029  249   8  1  91  1  0
 0   0     124 1224572  19924 360516    0    0     0     0 1021  226   0  0 100  0  0
 0   0     124 1224588  19924 360548    0    0     0     0 1017  229   0  0 100  0  0
 1   0     124 1286712  19932 360540    0    0     0   236 1023  436   6  0  94  0  0
================== ps top 20 Processes by CPU ===================
USER     %MEM %CPU   PID CMD
apache    2.0 18.7 14356 /usr/sbin/httpd
apache    2.0 17.3 14357 /usr/sbin/httpd
apache    3.2 13.8 14340 /usr/sbin/httpd
apache    2.0 13.5 14325 /usr/sbin/httpd
apache    3.2 10.8 14330 /usr/sbin/httpd
mysql     2.3  1.1  2569 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --socket=/var/lib/mysql/mysql.sock
root      0.5  0.0 12267 /usr/sbin/httpd
root      0.0  0.0  1580 [kjournald]
root      0.0  0.0   227 [kswapd0]
root      0.0  0.0   226 [pdflush]
root      0.0  0.0  2326 pcscd
ntp       0.2  0.0  2477 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root      0.7  0.0  2787 /usr/bin/python -tt /usr/sbin/yum-updatesd
root      0.0  0.0     1 init [3]
68        0.1  0.0  2767 hald
root      0.0  0.0  2097 auditd
root      0.0  0.0   225 [pdflush]
root      0.0  0.0   422 [kjournald]
root      0.0  0.0   483 /sbin/udevd -d

After crash.
=============

=========================== SUMMARY ============================
Tue Dec 23 19:46:32 HKT 2008
=========================== uptime ==============================
 19:46:34 up 3 days, 12:05,  0 users,  load average: 111.60, 108.31, 85.79
========================== free -m ==============================
             total       used       free     shared    buffers     cached
Mem:          2001       1987         14          0          1         20
-/+ buffers/cache:       1964         36
Swap:         4094       4067         27
========================= vmstat 1.5 ============================
 procs -----------memory------------ ---swap-- -----io---- -system-- ------cpu-------
 r   b    swpd    free   buff  cache   si   so    bi    bo   in   cs  us sy  id wa st
 0 113 4164604    9348   1264  22412    6   11    16    41   16   37  10  1  88  1  0
 0 116 4164196    7876   1264  22712 1240  304  1492   304 1169  623   4  1   0 95  0
 0 116 4165224    8464   1268  22708  320 1144   324  1148 1300  410   0  2   0 99  0
 0 116 4172532    8252   1280  22820  980 7512  1192  7556 1232  442   0  7   0 93  0
 0 116 4176940   11448   1296  22888 1388 4716  1456  4720 1178  452   1  5   0 95  0
================== ps top 20 Processes by CPU ===================
USER     %MEM %CPU   PID CMD
root      0.0  4.0 15290 ps -eo user,%mem,%cpu,pid,cmd --sort -%cpu
apache    0.6  1.2 14340 /usr/sbin/httpd
mysql     1.7  1.1  2569 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --socket=/var/lib/mysql/mysql.sock
apache    0.5  0.9 15283 /usr/sbin/httpd
apache    0.6  0.9 14330 /usr/sbin/httpd
apache    0.5  0.8 14356 /usr/sbin/httpd
apache    0.9  0.6 14390 /usr/sbin/httpd
apache    0.5  0.3 14406 /usr/sbin/httpd
apache    0.5  0.2 14395 /usr/sbin/httpd
apache    0.6  0.2 14464 /usr/sbin/httpd
apache    0.6  0.2 14437 /usr/sbin/httpd
apache    0.6  0.2 14489 /usr/sbin/httpd
apache    0.6  0.2 14422 /usr/sbin/httpd
apache    0.6  0.2 14454 /usr/sbin/httpd
apache    0.5  0.2 14473 /usr/sbin/httpd
apache    0.6  0.2 14510 /usr/sbin/httpd
apache    0.5  0.2 15253 /usr/sbin/httpd
apache    0.5  0.2 14498 /usr/sbin/httpd
apache    0.5  0.2 14467 /usr/sbin/httpd

Discussion
================

OK, if you're still with me, thanks for getting this far.

So before the Out of Memory, the CPU load is around 70% and the load average is high, but not critical. After the Out of Memory, the entire swap is full, the load average is insane, and the disk is swapping like crazy. There also seem to be a lot of httpd processes spawned, but not really doing much. At this point the server is inaccessible. Over the next hour or two the swap never really empties, and the server only returns to normal after a reboot.

I've been trying a few things by changing the Apache settings according to various suggestions around the internet, but I'm really stabbing in the dark here. Is there anything anyone can see which is obviously wrong with my configuration? I'm on the point of just slapping in some more RAM and forgetting about it, but I'd really like to understand it first.

I've also tried enabling the server-status page, but unless I'm watching it when it goes south, it hasn't been much help.

So. Any advice?

JM
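As a rough sanity check on the prefork settings, the Python sketch below multiplies MaxClients by a per-child memory estimate and compares it with the RAM in the box. The inputs are assumptions rather than measurements: the per-child figure takes the low end of the httpd %MEM values in the "before" report (2.0% of 2001 MB), and since %MEM counts shared pages it probably overstates what each child really costs. The 512 MB reserved for mysqld and the OS is a guess, and the function names are hypothetical.

```python
def worst_case_mb(max_clients, per_child_mb):
    """Memory needed if every allowed child is running at once."""
    return max_clients * per_child_mb


def safe_max_clients(ram_mb, reserved_mb, per_child_mb):
    """Largest child count that fits in RAM after reserving some headroom."""
    return max(1, (ram_mb - reserved_mb) // per_child_mb)


RAM_MB = 2001         # "total" column of free -m
SWAP_MB = 4094        # swap total from free -m
MAX_CLIENTS = 256     # MaxClients from the prefork section

# Assumption: low end of the httpd %MEM values (2.0%) in the "before" report.
PER_CHILD_MB = round(RAM_MB * 0.020)

# Assumption: headroom for mysqld, the OS and some page cache.
RESERVED_MB = 512

print(worst_case_mb(MAX_CLIENTS, PER_CHILD_MB))             # 10240
print(RAM_MB + SWAP_MB)                                     # 6095
print(safe_max_clients(RAM_MB, RESERVED_MB, PER_CHILD_MB))  # 37
```

With these guesses, 256 children would want about 10 GB against roughly 6 GB of RAM plus swap, while something like 37 children would fit in RAM. Both figures are order-of-magnitude only and depend entirely on the per-child estimate.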