Re: heavy swapping, not sure why

Merlin Moncure <mmoncure@xxxxxxxxx> · Tue, 30 Aug 2011 13:17:34 -0500

On Tue, Aug 30, 2011 at 11:17 AM, Lonni J Friedman <netllama@xxxxxxxxx> wrote:
> On Mon, Aug 29, 2011 at 5:42 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>> Lonni J Friedman <netllama@xxxxxxxxx> writes:
>>> I have several Linux-x68_64 based dedicated PostgreSQL servers where
>>> I'm experiencing significant swap usage growth over time.  All of them
>>> have fairly substantial amounts of RAM (not including swap), yet the
>>> amount of swap that postgres is using ramps up over time and
>>> eventually hurts performance badly.  In every case, simply restarting
>>> postgresql frees up all the swap in use (until it ramps up again
>>> later).
>>
>> If you're certain that it's restarting *postgres* that does it, and not
>> restarting your application or pgbouncer or some other code, then it
>> seems like you must have uncovered a memory leak someplace.  We haven't
>> got nearly enough info here to diagnose it though.
>>
>> First thing I'd want to know is which process(es) exactly are bloating.
>> The top output you showed us is unhelpful for that since it just shows
>> them all as "postmaster" --- you'll need to match up the problem PIDs
>> with "ps auxww" output.  Keep in mind also that top is pretty awful
>> about distinguishing a process's actual memory use (private memory)
>> from the portion of PG's shared memory that it happens to have touched.
>> What you need to pay attention to is RES minus SHR, not either number
>> alone.  With shared buffers set as high as you've got it, you'll
>> probably not be able to be sure that a process is bloating until it's
>> eaten hundreds of megs of private space.
>
> In the past 18 hours, swap usage has nearly doubled on systemA:
> $ free -m
>             total       used       free     shared    buffers     cached
> Mem:         56481      56210        271          0         11      52470
> -/+ buffers/cache:       3727      52753
> Swap:         1099         35       1064
>
> As a reminder, this is what it was yesterday afternoon (roughly 18
> hours earlier):
>            total       used       free     shared    buffers     cached
> Mem:         56481      55486        995          0         15      53298
> -/+ buffers/cache:       2172      54309
> Swap:         1099         18       1081

This is totally uninteresting.  All this means is that the o/s has
found 35 mb (17 more) of memory sitting around doing nothing and
swapped it out.  That inconsequential amount of extra memory can now
be used by something else, which is expected and desired behavior.

Swap usage, in terms of the amount of memory in swap holding unused
resident memory, is a *good* thing.  Swap churn, that is memory
constantly moving in and out of the swap volume due to the o/s
managing different and competing demands is a bad thing.  FWICT, your
churn is zero -- the vast majority of your memory is allocated to
caching of filesystem pages, which is totally normal.  This is making
me wonder if your original diagnosis of swap causing your surrounding
performance issues is in fact correct.  We need to see a memory and
i/o profile (iostat/iowait) during your 'bad' times to be sure though.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general