Hi - I'm running into an OOM-killer issue when running a specific query (no virtual machine running) and, based on researching the issue, I can probably fix by making the following sysctl adjustments:
vm.overcommit_memory = 2
vm.overcommit_ratio = 0
However, I am perplexed as to why I am running into the issue in the first place. The machine (running Linux 2.6.34.7-61.fc13.x86_64) is dedicated to Postgres (v9.0.0 [RPM package: postgresql90-9.0.0-1PGDG.fc13.1.x86_64]) and the following memory usage is pretty typical for the system (via "top"):
Mem: 8121992k total, 2901960k used, 5220032k free, 237408k buffers
Swap: 1048572k total, 235940k used, 812632k free, 2053768k cached
Under steady-state conditions, the following shows the virtual memory size for postgres backend processes:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8506 postgres 20 0 2327m 3084 1792 S 0.0 0.0 0:00.33 postgres
8504 postgres 20 0 2326m 14m 13m S 0.0 0.2 0:01.32 postgres
8505 postgres 20 0 2326m 728 452 S 0.0 0.0 0:00.91 postgres
3582 postgres 20 0 2325m 54m 53m S 0.0 0.7 0:02.03 postgres
My current relevant postgresql.conf settings are the following:
shared_buffers = 2100MB
temp_buffers = 8MB
work_mem = 32MB
maintenance_work_mem = 16MB
max_stack_depth = 2MB
constraint_exclusion = partition
When executing the query, I've been watching the "top" activity, sorted by resident memory. Upon execution, no other processes appear to take additional resident memory, except a postgres backend process servicing the query, which goes to +6Gb (triggering the OOM-killer). Given the settings in postgresql.conf, and my anecdotal understanding of Postgres memory management functions, I am uncertain why Postgres exhausts physical memory instead of swapping to temporary files. Do I need to lower my work_mem setting since the subquery involves a partitioned table, causing a multiplier effect to the memory used (I have tried per-connection settings of 6MB)? Would tweaking query planning settings help?
Thanks in advance!
If it helps, I have included the query (with column names aliased to their data type), a brief description of the applicable table's contents, and an abridged copy of the EXPLAIN ANALYZE output
SELECT "bigint", "date", "text"
FROM tableA AS A
WHERE A."boolean" = 'true' AND
(A."text" = 'abc' OR A."text" = 'xyz') AND
A."bigint" NOT IN (SELECT "bigint" FROM tableB)
ORDER BY A."date" DESC;
tableA:
- total table contains ~11 million records (total width: 109 bytes)
- partitioned by month (180 partitions)
- each table partition contains ~100k records
tableB:
- total table contains ~400k records (total width: 279 bytes)
- partitioned by month (96 partitions)
- each table partition contains ~30k records
EXPLAIN ANALYZE output:
Note: could not produce output for exact query due to OOM-killer, but ran query by limiting the subquery to the first 50 results. The planner iterates over all partitions, but only the first two partitions are noted for brevity.
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=451279.67..451279.70 rows=10 width=55) (actual time=18343.085..18343.090 rows=10 loops=1)
-> Sort (cost=451279.67..456398.37 rows=2047480 width=55) (actual time=18343.083..18343.087 rows=10 loops=1)
Sort Key: A."Date"
Sort Method: top-N heapsort Memory: 26kB
-> Result (cost=1.21..407034.37 rows=2047480 width=55) (actual time=0.793..17014.726 rows=4160606 loops=1)
-> Append (cost=1.21..407034.37 rows=2047480 width=55) (actual time=0.792..16119.298 rows=4160606 loops=1)
-> Seq Scan on tableA A (cost=1.21..19.08 rows=1 width=44) (actual time=0.002..0.002 rows=0 loops=1)
Filter: ("boolean" AND (NOT (hashed SubPlan 1)) AND (("text" = 'abc'::text) OR ("text" = 'xyz'::text)))
SubPlan 1
-> Limit (cost=0.00..1.08 rows=50 width=8) (actual time=0.010..0.054 rows=50 loops=210)
-> Result (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.009..0.044 rows=50 loops=210)
-> Append (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.008..0.031 rows=50 loops=210)
-> Seq Scan on tableB (cost=0.00..15.30 rows=530 width=8) (actual time=0.001..0.001 rows=0 loops=210)
-> Seq Scan on tableB_201201 tableB (cost=0.00..15.30 rows=530 width=8) (actual time=0.000..0.000 rows=0 loops=210)
-> Seq Scan on tableB_201112 tableB (cost=0.00..251.25 rows=12125 width=8) (actual time=0.006..0.019 rows=50 loops=210)
-> ...
-> Seq Scan on tableA_201201 A (cost=1.21..19.08 rows=1 width=44) (actual time=0.001..0.001 rows=0 loops=1)
Filter: ("boolean" AND (NOT (hashed SubPlan 1)) AND (("text" = 'abc'::text) OR ("text" = 'xyz'::text)))
SubPlan 1
-> Limit (cost=0.00..1.08 rows=50 width=8) (actual time=0.010..0.054 rows=50 loops=210)
-> Result (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.009..0.044 rows=50 loops=210)
-> Append (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.008..0.031 rows=50 loops=210)
-> Seq Scan on tableB (cost=0.00..15.30 rows=530 width=8) (actual time=0.001..0.001 rows=0 loops=210)
-> Seq Scan on tableB_201201 tableB (cost=0.00..15.30 rows=530 width=8) (actual time=0.000..0.000 rows=0 loops=210)
-> Seq Scan on tableB_201112 tableB (cost=0.00..251.25 rows=12125 width=8) (actual time=0.006..0.019 rows=50 loops=210)
-> ...
-> Seq Scan on tableA_201112 A (cost=1.21..794.69 rows=5980 width=55) (actual time=0.789..12.686 rows=12075 loops=1)
Filter: ("boolean" AND (NOT (hashed SubPlan 1)) AND (("text" = 'abc'::text) OR ("text" = 'xyz'::text)))
SubPlan 1
-> Limit (cost=0.00..1.08 rows=50 width=8) (actual time=0.010..0.054 rows=50 loops=210)
-> Result (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.009..0.044 rows=50 loops=210)
-> Append (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.008..0.031 rows=50 loops=210)
-> Seq Scan on tableB (cost=0.00..15.30 rows=530 width=8) (actual time=0.001..0.001 rows=0 loops=210)
-> Seq Scan on tableB_201201 tableB (cost=0.00..15.30 rows=530 width=8) (actual time=0.000..0.000 rows=0 loops=210)
-> Seq Scan on tableB_201112 tableB (cost=0.00..251.25 rows=12125 width=8) (actual time=0.006..0.019 rows=50 loops=210)
-> Seq Scan on tableB_201111 tableB (cost=0.00..604.89 rows=29189 width=8) (never executed)
-> ...
-> Seq Scan on tableA_201111 A (cost=1.21..2666.12 rows=14670 width=55) (actual time=0.441..36.680 rows=29189 loops=1)
Filter: ("boolean" AND (NOT (hashed SubPlan 1)) AND (("text" = 'abc'::text) OR ("text" = 'xyz'::text)))
SubPlan 1
-> Limit (cost=0.00..1.08 rows=50 width=8) (actual time=0.010..0.054 rows=50 loops=210)
-> Result (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.009..0.044 rows=50 loops=210)
-> Append (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.008..0.031 rows=50 loops=210)
-> Seq Scan on tableB (cost=0.00..15.30 rows=530 width=8) (actual time=0.001..0.001 rows=0 loops=210)
-> Seq Scan on tableB_201201 tableB (cost=0.00..15.30 rows=530 width=8) (actual time=0.000..0.000 rows=0 loops=210)
-> Seq Scan on tableB_201112 tableB (cost=0.00..251.25 rows=12125 width=8) (actual time=0.006..0.019 rows=50 loops=210)
-> Seq Scan on tableB_201111 tableB (cost=0.00..604.89 rows=29189 width=8) (never executed)
-> ...
-> ...
Total runtime: 18359.851 ms
(23327 rows)