On 8/17/09 4:43 PM, "Scott Carey" <scott@xxxxxxxxxxxxxxxxx> wrote:

> On 8/17/09 10:24 AM, "Jeremy Carroll" <jeremy.carroll@xxxxxxxxxxxxxxxxxxxxx>
> wrote:
>
>> I believe this is exactly what is happening. I see that the TOP output
>> lists a large amount of VIRT & RES size being used, but the kernel does
>> not report this memory as being reserved and instead lists it as free
>> memory or cached.
>
> Oh! I recall I found that fun behaviour in Linux and thought it was a
> Postgres bug a while back. It has a lot of other bad effects on how the
> kernel chooses to swap. I really should have recalled that one. Due to
> this behavior, I had initially blamed postgres for "pinning" memory in
> shared_buffers in the disk cache. But that symptom is one of linux
> thinking somehow that pages read into shared memory are still cached (or
> something similar).
>
> Basically, it thinks that there is more free memory than there is when
> there is a lot of shared memory. Run a postgres instance with over 50% of
> memory assigned to shared_buffers and when memory pressure builds kswapd
> will go NUTS in CPU use, apparently confused. With a high OS 'swappiness'
> value it will swap in and out too much, and with a low 'swappiness' it
> will CPU spin, aware on one hand that it is low on memory, but confused by
> the large apparent amount free, so it doesn't free up much and kswapd
> chews up all the CPU and the system almost hangs. It behaves as if the
> logic that determines where to get memory from for a process knows that
> it's almost out, but the logic that decides what to swap out thinks that
> there is plenty free. The larger the ratio of shared memory to total
> memory in the system, the higher the CPU use by the kernel when managing
> the buffer cache.
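To check whether a box is in the situation described above, it may help to compare the shared memory figure against what the kernel reports as free and cached. Below is a rough sketch in Python, not a tested tool. The function names (parse_meminfo, shared_fraction, report) are made up for illustration. It assumes a Linux /proc/meminfo that has a Shmem line, which only newer kernels export, and it assumes that shared_buffers shows up under Shmem. As far as I know that is true for System V shared memory, but probably not when huge pages are used.

```python
# Rough sketch: parse /proc/meminfo-style text and report how much of RAM
# is shared memory. All function names here are hypothetical.

def parse_meminfo(text):
    """Return {field: int}. Most values are in kB; a few are plain counts."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip anything that is not "Name: <number> [unit]".
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def shared_fraction(info):
    """Shmem / MemTotal, or None if the kernel does not report Shmem."""
    if "Shmem" not in info or not info.get("MemTotal"):
        return None
    return info["Shmem"] / info["MemTotal"]


def report(path="/proc/meminfo"):
    """Print the fields relevant to the free/cached vs. shared confusion."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    for key in ("MemTotal", "MemFree", "Cached", "Shmem"):
        print(key, info.get(key, "n/a"))
    frac = shared_fraction(info)
    if frac is not None:
        print("shared / total: %.1f%%" % (100 * frac))
```

Calling report() on the database host prints the four fields plus the ratio. If Shmem is a large share of MemTotal and Cached is about the same size, then most of what looks like reclaimable cache is likely shared memory that cannot simply be dropped.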
Based on a little digging, I'd say that this patch to the kernel probably
alleviates the performance problems I've seen with swapping when shared
mem is high:

http://lwn.net/Articles/286472/

Other patches have improved the shared memory tracking, but it's not clear
if tools like top have taken advantage of the new info available in /proc.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance