Re: System overload / context switching / oom, 8.3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Andy Colson wrote:
> work_mem = 32MB
> maintenance_work_mem = 64MB


if you have lots and lots of connections, you might need to cut these down?

definitely, work_mem is the main focus.

If I understand correctly, th 64MB maintenance_work_mem is per vacuum task, and on this system there are 3 autovacuums. I was wondering if with this many databases, possibly decreasing the maintenance_work_mem significantly and starting up more autovacuums.

Yes, also moving databases to other servers in order to decrease the number of connections.

> effective_cache_size = 5000MB

I see your running a 32bit, but with bigmem support, but still, one process is limited to 4gig. You'd make better use of all that ram if you switched to 64bit. And this cache, I think, would be limited to 4gig.
All of the cache is being used because the operating system kernel is built with the memory extensions to access outside the 32bit range. This is the cache size reported by free(1). However, there may be advantages to switch to 64bit.


The oom-killer is kicking in, at some point, so somebody is using too much ram. There should be messages or logs or something, right? (I've never enabled the oom stuff so dont know much about it). But the log messages might be helpful.

Also, do you know what the oom max memory usage is set to?  You said:
"oom_adj -17. vm_overcommit_memory set to 2, but at this time vm_overcommit_ratio was still at 50 (has since been changed to 90, should this be 100?)"

Oh man. I encourage everyone to find out what /proc/<pid>/oom_adj means. You have to set this to keep the Linux "oom-killer" from doing a kill -9 on postgres postmaster. On Debian:

echo -17 >> /proc/$(cat /var/run/postgresql/8.3-main.pid)/oom_adj

This is my experience with oom-killer. After putting -17 into /proc/pid/oom_adj, oom-killer seemed to kill one of the database connection processes. Then the postmaster attempted to shut down all processes because of possible shared memory corruption. The database then went into recovery mode. After stopping the database some of the processes were stuck and could not be killed. The operating system was rebooted and the database returned with no data loss.

My earlier experience with oom-killer: If you don't have this setting in oom_adj, then it seems likely (certain?) that oom-killer kills the postmaster because of the algorithm oom-killer uses (called badness()) which adds children process scores to their parent's scores. I don't know if sshd was killed but I don't think anyone could log in to the OS. After rebooting there was a segmentation violation when trying to start the postmaster. I don't think that running pg_resetxlog with defaults is a good idea. My colleague who has been investigating the crash believes that we could have probably eliminated at least some of the data loss with more judicious use of pg_resetxlog.

There was a discussion on the postgres lists about somehow having the postgres distribution include the functionality to set oom_adj on startup. To my knowledge, that's not in 8.3 so I wrote a script and init.d script to do this on Debian systems.

As far as vm.over_commit memory goes, there are three settings and most recommend setting it to 2 for postgres. However, this does not turn off oom-killer! You need to put -17 in /proc/<pid>/oom_adj whether you do anything about vm.over_commit memory or not We had vm_overcommit_memory set to 2 and oom-killer became active and killed the postmaster.

Kind of off-topic, but a Linux kernel parameter that's often not set on database servers is elevator=deadline which sets up the io scheduling algorithm. The algorithm can be viewed/set at runtime for example the disk /dev/sdc in /sys/block/sdc/queue/scheduler.

Rob




--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux