Hi,
I've been facing an out-of-memory condition after running SLONY for several hours to replicate a 1TB database with about 23,000 tables. The error occurred after about 50% of the tables had been replicated.
Most of the 48 GB of memory is being used for the file system cache, but for some reason the initial copy of one table performed by SLONY abended with an out-of-memory condition. The table being transferred at the moment of the abend has two text columns.
After the OOM condition is raised, a "select *" on that specific table also fails with an out-of-memory error.
My guess is that PostgreSQL is trying to perform an atomic allocation (one that cannot wait for reclaim) of a contiguous memory area and is failing due to memory fragmentation.
My idea to prevent this issue is to reserve 500 MB of free memory for atomic allocations by setting vm.min_free_kbytes = 512000 (the parameter is expressed in kilobytes) in /etc/sysctl.conf.
Is this a good approach to solving it?
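For reference, this is roughly how I would apply the setting (a sketch; 512000 being 500 MB expressed in binary kilobytes):
# sysctl -w vm.min_free_kbytes=512000
# echo "vm.min_free_kbytes = 512000" >> /etc/sysctl.conf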
Another question: is it safe to flush the file system cache using these steps (put together as a script after the list)?
1) Shutdown postgresql
2) sync
3) echo 1 > /proc/sys/vm/drop_caches;
4) Startup postgresql
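Put together, that would look something like this (the pg_ctl invocation and data directory are only examples; the exact stop/start commands depend on the installation):
# su - postgres -c "pg_ctl stop -D /var/lib/pgsql/data -m fast"
# sync
# echo 1 > /proc/sys/vm/drop_caches
# su - postgres -c "pg_ctl start -D /var/lib/pgsql/data"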
Some data about the issue:
SLONY error:
2014-12-01 12:14:56 BRST ERROR remoteWorkerThread_1: copy to stdout on provider - PGRES_FATAL_ERROR ERROR: out of memory
DETAIL: Failed on request of size 123410655.
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.3 (Santiago)
# uname -m
x86_64
# free
             total       used       free     shared    buffers     cached
Mem:      49422076   49038348     383728          0     268488   47520476
-/+ buffers/cache:    1249384   48172692
Swap:     16777208          0   16777208
# cat /proc/meminfo | grep Commit
CommitLimit: 41488244 kB
Committed_AS: 689312 kB
# /sbin/sysctl vm.min_free_kbytes
vm.min_free_kbytes = 135168
After SLONY hits the out-of-memory condition, a select * on the table fails as well:
FiscalWeb=# select * from "8147_spunico"."sincdc";
ERROR: out of memory
DETAIL: Failed on request of size 268435456.
Backing up the table with pg_dump fails with an out-of-memory error too. (For scale, both failing requests are single large allocations: 123410655 bytes is roughly 118 MB, and 268435456 bytes is exactly 256 MB.)
Buddyinfo indicates memory fragmentation after the out-of-memory condition is raised:
# cat /proc/buddyinfo
Node 0, zone      DMA      3      2      2      3      2      1      1      0      1      0      3
Node 0, zone    DMA32  94091  69426  30367   7531    996    126      8      0      0      1      0
Node 0, zone   Normal   6840     23      0      0      0      0      0      0      0      0      1
Node 1, zone   Normal    730    338    159     93     44     26     11      9      3      1      3
Node 2, zone   Normal     68    535    309    144     60     18     13     12     32     29      7
Node 3, zone   Normal 319246 341233 173115  52602   5989    646    232     63      8      3      1
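For reading that output: each column is the count of free blocks of order N, i.e. contiguous runs of 4 KB * 2^N, so apart from a single order-10 block, Node 0's Normal zone has nothing free above order 1. A rough one-liner to total the free memory buddyinfo reports (a sketch, assuming the standard layout of 11 order columns after the zone name):
# awk '{ for (i = 5; i <= NF; i++) sum += $i * 2^(i-5) * 4 } END { print sum " kB free" }' /proc/buddyinfo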
postgres=# select version();
version
-----------------------------------------------------------------------------------------------------------
PostgreSQL 8.3.21 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4)
(1 row)
Thank you!
Carlos Reimer