Re: query produces 1 GB temp file

Alexander Staubo <alex@xxxxxxxxxxxxxxx> · Fri, 27 Oct 2006 16:47:45 +0200

While I can't explain why PostgreSQL would use that memory, I  
recommend looking into tweaking the work_mem parameter. This setting  
specifies how much memory PostgreSQL on certain temporary data  
structures (hash tables, sort vectors) until it starts using  
temporary files. Quoting the docs:

work_mem (integer)
Specifies the amount of memory to be used by internal sort  
operations and hash tables before switching to temporary disk  
files. The value is specified in kilobytes, and defaults to 1024  
kilobytes (1 MB). Note that for a complex query, several sort or  
hash operations might be running in parallel; each one will be  
allowed to use as much memory as this value specifies before it  
starts to put data into temporary files. Also, several running  
sessions could be doing such operations concurrently. So the total  
memory used could be many times the value of work_mem; it is  
necessary to keep this fact in mind when choosing the value. Sort  
operations are used for ORDER BY, DISTINCT, and merge joins. Hash  
tables are used in hash joins, hash-based aggregation, and hash- 
based processing of IN subqueries.

Alexander.

On Feb 5, 2005, at 18:25 , Dirk Lutzebaeck wrote:

Hi,

here is a query which produces over 1G temp file in pgsql_tmp. This
is on pgsql 7.4.2, RHEL 3.0, XEON MP machine with 32GB RAM, 300MB
sort_mem and 320MB shared_mem.

Below is the query and results for EXPLAIN and EXPLAIN ANALYZE. All
tables have been analyzed before.

Can some please explain why the temp file is so huge? I understand
there are a lot of rows. All relevant indices seem to be used.

Thanks in advance,

Dirk

EXPLAIN
SELECT DISTINCT ON (ft.val_9, ft.created, ft.flatid) ft.docstart,  
ft.flatobj, bi.oid, bi.en
FROM bi, en, df AS ft, es
WHERE bi.rc=130170467
AND bi.en=ft.en
AND bi.co=117305223
AND bi.hide=FALSE
AND ft.en=en.oid
AND es.en=bi.en
AND es.co=bi.co
AND es.spec=122293729
AND (ft.val_2='DG' OR ft.val_2='SK')
AND ft.docstart=1
ORDER BY ft.val_9 ASC, ft.created DESC
LIMIT 1000 OFFSET 0;

 Limit  (cost=8346.75..8346.78 rows=3 width=1361)
   ->  Unique  (cost=8346.75..8346.78 rows=3 width=1361)
         ->  Sort  (cost=8346.75..8346.76 rows=3 width=1361)
               Sort Key: ft.val_9, ft.created, ft.flatid
               ->  Nested Loop  (cost=0.00..8346.73 rows=3 width=1361)
                     ->  Nested Loop  (cost=0.00..5757.17 rows=17  
width=51)
                           ->  Nested Loop  (cost=0.00..5606.39  
rows=30 width=42)
                                 ->  Index Scan using es_sc_index  
on es  (cost=0.00..847.71 rows=301 width=8)
                                       Index Cond: ((spec =  
122293729) AND (co = 117305223::oid))
                                 ->  Index Scan using bi_env_index  
on bi  (cost=0.00..15.80 rows=1 width=42)
                                       Index Cond: ("outer".en =  
bi.en)
                                       Filter: ((rc =  
130170467::oid) AND (co = 117305223::oid) AND (hide = false))
                           ->  Index Scan using en_oid_index on en   
(cost=0.00..5.01 rows=1 width=9)
                                 Index Cond: ("outer".en = en.oid)
                     ->  Index Scan using df_en on df ft   
(cost=0.00..151.71 rows=49 width=1322)
                           Index Cond: ("outer".en = ft.en)
                           Filter: (((val_2 = 'DG'::text) OR (val_2  
= 'SK'::text)) AND (docstart = 1))
(17 rows)

--------------

EXPLAIN ANALYZE gives:

 Limit  (cost=8346.75..8346.78 rows=3 width=1361) (actual  
time=75357.465..75679.964 rows=1000 loops=1)
   ->  Unique  (cost=8346.75..8346.78 rows=3 width=1361) (actual  
time=75357.459..75675.371 rows=1000 loops=1)
         ->  Sort  (cost=8346.75..8346.76 rows=3 width=1361)  
(actual time=75357.448..75499.263 rows=22439 loops=1)
               Sort Key: ft.val_9, ft.created, ft.flatid
               ->  Nested Loop  (cost=0.00..8346.73 rows=3  
width=1361) (actual time=34.104..18016.005 rows=703677 loops=1)
                     ->  Nested Loop  (cost=0.00..5757.17 rows=17  
width=51) (actual time=0.467..3216.342 rows=48563 loops=1)
                           ->  Nested Loop  (cost=0.00..5606.39  
rows=30 width=42) (actual time=0.381..1677.014 rows=48563 loops=1)
                                 ->  Index Scan using es_sc_index  
on es  (cost=0.00..847.71 rows=301 width=8) (actual  
time=0.184..46.519 rows=5863 loops=1)
                                       Index Cond: ((spec =  
122293729) AND (co = 117305223::oid))
                                 ->  Index Scan using bi_env_index  
on bi  (cost=0.00..15.80 rows=1 width=42) (actual time=0.052..0.218  
rows=8 loops=5863)
                                       Index Cond: ("outer".en =  
bi.en)
                                       Filter: ((rc =  
130170467::oid) AND (co = 117305223::oid) AND (hide = false))
                           ->  Index Scan using en_oid_index on en   
(cost=0.00..5.01 rows=1 width=9) (actual time=0.015..0.019 rows=1  
loops=48563)
                                 Index Cond: ("outer".en = en.oid)
                     ->  Index Scan using df_en on df ft   
(cost=0.00..151.71 rows=49 width=1322) (actual time=0.038..0.148  
rows=14 loops=48563)
                           Index Cond: ("outer".en = ft.en)
                           Filter: (((val_2 = 'DG'::text) OR (val_2  
= 'SK'::text)) AND (docstart = 1))
 Total runtime: 81782.052 ms
(18 rows)

---------------------------(end of  
broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings