Re: query produces 1 GB temp file

Dirk Lutzebäck <lutzeb@xxxxxxxxxx> · Fri, 27 Oct 2006 16:51:25 +0200

Hi,

I'm sorry but it look like my computer has resent older posts from me,
sigh...

Dirk

Alexander Staubo wrote:
While I can't explain why PostgreSQL would use that
memory, I recommend looking into tweaking the work_mem parameter. This
setting specifies how much memory PostgreSQL on certain temporary data
structures (hash tables, sort vectors) until it starts using temporary
files. Quoting the docs:

  work_mem (integer)

Specifies the amount of memory to be used by internal sort operations
and hash tables before switching to temporary disk files. The value is
specified in kilobytes, and defaults to 1024 kilobytes (1 MB). Note
that for a complex query, several sort or hash operations might be
running in parallel; each one will be allowed to use as much memory as
this value specifies before it starts to put data into temporary files.
Also, several running sessions could be doing such operations
concurrently. So the total memory used could be many times the value of
work_mem; it is necessary to keep this fact in mind when choosing the
value. Sort operations are used for ORDER BY, DISTINCT, and merge
joins. Hash tables are used in hash joins, hash-based aggregation, and
hash-based processing of IN subqueries.

Alexander.

On Feb 5, 2005, at 18:25 , Dirk Lutzebaeck wrote:

  Hi,

here is a query which produces over 1G temp file in pgsql_tmp. This

is on pgsql 7.4.2, RHEL 3.0, XEON MP machine with 32GB RAM, 300MB

sort_mem and 320MB shared_mem.

Below is the query and results for EXPLAIN and EXPLAIN ANALYZE. All

tables have been analyzed before.

Can some please explain why the temp file is so huge? I understand

there are a lot of rows. All relevant indices seem to be used.

Thanks in advance,

Dirk

EXPLAIN

SELECT DISTINCT ON (ft.val_9, ft.created, ft.flatid) ft.docstart,
ft.flatobj, bi.oid, bi.en

FROM bi, en, df AS ft, es

WHERE bi.rc=130170467

AND bi.en=ft.en

AND bi.co=117305223

AND bi.hide=FALSE

AND ft.en=en.oid

AND es.en=bi.en

AND es.co=bi.co

AND es.spec=122293729

AND (ft.val_2='DG' OR ft.val_2='SK')

AND ft.docstart=1

ORDER BY ft.val_9 ASC, ft.created DESC

LIMIT 1000 OFFSET 0;

 Limit  (cost=8346.75..8346.78 rows=3 width=1361)

   ->  Unique  (cost=8346.75..8346.78 rows=3 width=1361)

         ->  Sort  (cost=8346.75..8346.76 rows=3 width=1361)

               Sort Key: ft.val_9, ft.created, ft.flatid

               ->  Nested Loop  (cost=0.00..8346.73 rows=3
width=1361)

                     ->  Nested Loop  (cost=0.00..5757.17 rows=17
width=51)

                           ->  Nested Loop  (cost=0.00..5606.39
rows=30 width=42)

                                 ->  Index Scan using es_sc_index on
es  (cost=0.00..847.71 rows=301 width=8)

                                       Index Cond: ((spec = 122293729)
AND (co = 117305223::oid))

                                 ->  Index Scan using bi_env_index
on bi  (cost=0.00..15.80 rows=1 width=42)

                                       Index Cond: ("outer".en = bi.en)

                                       Filter: ((rc = 130170467::oid)
AND (co = 117305223::oid) AND (hide = false))

                           ->  Index Scan using en_oid_index on en 
(cost=0.00..5.01 rows=1 width=9)

                                 Index Cond: ("outer".en = en.oid)

                     ->  Index Scan using df_en on df ft 
(cost=0.00..151.71 rows=49 width=1322)

                           Index Cond: ("outer".en = ft.en)

                           Filter: (((val_2 = 'DG'::text) OR (val_2 =
'SK'::text)) AND (docstart = 1))

(17 rows)

--------------

EXPLAIN ANALYZE gives:

 Limit  (cost=8346.75..8346.78 rows=3 width=1361) (actual
time=75357.465..75679.964 rows=1000 loops=1)

   ->  Unique  (cost=8346.75..8346.78 rows=3 width=1361) (actual
time=75357.459..75675.371 rows=1000 loops=1)

         ->  Sort  (cost=8346.75..8346.76 rows=3 width=1361) (actual
time=75357.448..75499.263 rows=22439 loops=1)

               Sort Key: ft.val_9, ft.created, ft.flatid

               ->  Nested Loop  (cost=0.00..8346.73 rows=3
width=1361) (actual time=34.104..18016.005 rows=703677 loops=1)

                     ->  Nested Loop  (cost=0.00..5757.17 rows=17
width=51) (actual time=0.467..3216.342 rows=48563 loops=1)

                           ->  Nested Loop  (cost=0.00..5606.39
rows=30 width=42) (actual time=0.381..1677.014 rows=48563 loops=1)

                                 ->  Index Scan using es_sc_index on
es  (cost=0.00..847.71 rows=301 width=8) (actual time=0.184..46.519
rows=5863 loops=1)

                                       Index Cond: ((spec = 122293729)
AND (co = 117305223::oid))

                                 ->  Index Scan using bi_env_index
on bi  (cost=0.00..15.80 rows=1 width=42) (actual time=0.052..0.218
rows=8 loops=5863)

                                       Index Cond: ("outer".en = bi.en)

                                       Filter: ((rc = 130170467::oid)
AND (co = 117305223::oid) AND (hide = false))

                           ->  Index Scan using en_oid_index on en 
(cost=0.00..5.01 rows=1 width=9) (actual time=0.015..0.019 rows=1
loops=48563)

                                 Index Cond: ("outer".en = en.oid)

                     ->  Index Scan using df_en on df ft 
(cost=0.00..151.71 rows=49 width=1322) (actual time=0.038..0.148
rows=14 loops=48563)

                           Index Cond: ("outer".en = ft.en)

                           Filter: (((val_2 = 'DG'::text) OR (val_2 =
'SK'::text)) AND (docstart = 1))

 Total runtime: 81782.052 ms

(18 rows)

---------------------------(end of
broadcast)---------------------------

TIP 5: don't forget to increase your free space map settings

-- 

Dirk Lutzebäck <lutzeb@xxxxxxxxxx> Tel +49.30.5362.1635
Fax .1638

CTO AEC/communications GmbH & Co.
KG, Berlin, Germany