Re: Idle backends outside a transaction holding onto large amounts of memory / swap space?

Pavel Stehule <pavel.stehule@xxxxxxxxx> · Mon, 25 Feb 2019 11:52:10 +0100

Hi

po 25. 2. 2019 v 11:37 odesílatel Tobias Gierke <tobias.gierke@xxxxxxxxxxxxxxxx> napsal:
Hi,

Recently we started seeing the Linux OOM killer kicking in and killing 

PostgreSQL processes on one of our development machines.

The PostgreSQL version we're using was compiled by us, is running on 

CentOS 7 and is

PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 

20150623 (Red Hat 4.8.5-28), 64-bit

While looking at the machine I saw the following peculiar thing: Swap is 

almost completely full while buff/cache still has ~3GB available.

root@demo:/etc/systemd/system # free -m

               total        used        free      shared buff/cache   

available

Mem:           7820        3932         770        1917 3116        1548

Swap:          4095        3627         468

Running the following one-liner shows that two PostgreSQL processes are 

using most of the swap:

for proc in /proc/*;   do echo $proc ; cat $proc/smaps 2>/dev/null | awk 

'/Swap/{swap+=$2}END{print swap "\tKB\t'`echo $proc|awk '{print $1}' `'" 

}'; done | sort -n | awk '{total+=$1}/[0-9]/;END{print total "\tKB\tTotal"}'

1387496 KB      /proc/22788

1837872 KB      /proc/22789

I attached the memory mappings of these processes to the mail. Both 

processes inside PostgreSQL show up as idle outside of any transaction 

and belong to a JDBC (Java) connection pool.

Is good to close sessions after some times (once per hour) because allocated memory is released to operation system when process is closed. Without it, the operation memory can be fragmented.

if run some big queries then some memory can be assigned to process, and is not released.

Regards

Pavel

voip=# select * from pg_stat_activity where pid in (22788,22789);

-[ RECORD 1 ]----+------------------------------

datid            | 16404

pid              | 22789

usesysid         | 10

usename          | postgres

client_addr      | 127.0.0.1

client_hostname  |

client_port      | 45649

backend_start    | 2019-02-25 00:17:15.246625+01

xact_start       |

query_start      | 2019-02-25 10:52:07.729096+01

state_change     | 2019-02-25 10:52:07.748077+01

wait_event_type  | Client

wait_event       | ClientRead

state            | idle

backend_xid      |

backend_xmin     |

query            | COMMIT

backend_type     | client backend

-[ RECORD 2 ]----+------------------------------

datid            | 16404

pid              | 22788

usesysid         | 10

usename          | postgres

client_addr      | 127.0.0.1

client_hostname  |

client_port      | 45648

backend_start    | 2019-02-25 00:17:15.24631+01

xact_start       |

query_start      | 2019-02-25 10:55:42.577158+01

state_change     | 2019-02-25 10:55:42.577218+01

wait_event_type  | Client

wait_event       | ClientRead

state            | idle

backend_xid      |

backend_xmin     |

query            | ROLLBACK

backend_type     | client backend

--------->8------------------>8------------------>8------------------>8---------

I attached the postgresql.conf we're using to this mail as well.

Is this expected behaviour ? Did we over-commission the machine in our 

postgresql.conf ?

Thanks,

Tobias