On Wed, Aug 04, 2010 at 05:58:12PM +0800, Chris Webb wrote:
> Wu Fengguang <fengguang.wu@xxxxxxxxx> writes:
>
> > This is interesting. Why is it waiting for 1m here? Are there high CPU
> > loads? Would you do a
> >
> >         echo t > /proc/sysrq-trigger
> >
> > and show us the dmesg?
>
> Annoyingly, magic-sysrq isn't compiled in on these kernels. Is there
> another way I can get this info for you? Replacing the kernels on the
> machines is a painful job as I have to give the clients running on them
> quite a bit of notice of the reboot, and I haven't been able to reproduce
> the problem on a test machine.

Maybe turn off KSM? It helps to isolate problems. It's a relatively new
and complex feature after all.

> I also think the swap use is much better following a reboot, and only
> starts to spiral out of control after the machines have been running for
> a week or so.

Something deteriorates over a long time.. It may take time to catch this
bug.

> However, your suggestion is right that the CPU loads on these machines
> are typically quite high. The large number of kvm virtual machines they
> run means that loads of eight or even sixteen in /proc/loadavg are not
> unusual, and these are higher when there's swap than after it has been
> removed. I assume this is mostly because of increased IO wait, as this
> number increases significantly in top.

iowait = CPU (idle) waiting for disk IO

So iowait means not CPU load, but rather disk load :)

Thanks,
Fengguang
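
P.S. If you want to try the KSM test without rebooting, a rough sketch
(assuming CONFIG_KSM is enabled and sysfs is mounted at /sys):

        cat /sys/kernel/mm/ksm/run              # 1 means ksmd is running
        echo 2 > /sys/kernel/mm/ksm/run         # stop ksmd and unmerge all shared pages
        # or: echo 0 > /sys/kernel/mm/ksm/run   # stop ksmd but keep already-merged pages

That should let you rule KSM in or out on the live machines.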
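
P.P.S. To watch iowait over time without keeping top open, something like
this should work (the "cpu" line in /proc/stat carries iowait as its
fifth counter, in jiffies):

        # print the accumulated iowait jiffies once per second
        while sleep 1; do
                awk '/^cpu /{print "iowait jiffies:", $6}' /proc/stat
        done

vmstat 1 shows roughly the same thing in its "wa" column.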