PostgreSQL 9.1 : why is this query slow?

Joost Kraaijeveld <J.Kraaijeveld@xxxxxxxxxx> · Mon, 28 Nov 2011 17:42:06 +0100

Hi All,

I have a table with 665605 rows (counted, vacuum-ed):
CREATE TABLE unique_words
( filename text NOT NULL,
  filetype text NOT NULL,
  word text NOT NULL,
  count integer,)

The query is:
select f.word , count(f.word) from 
unique_words as f, 
unique_words as s ,
unique_words as n 
where
(f.word = s.word and s.word = n.word)
and
(f.filetype = 'f' and s.filetype = 's' and n.filetype = 'n')
group by f.word

Explain says:
"GroupAggregate  (cost=0.00..67237557.88 rows=1397 width=6)"
"  ->  Nested Loop  (cost=0.00..27856790.31 rows=7876150720 width=6)"
"        ->  Nested Loop  (cost=0.00..118722.04 rows=14770776 width=12)"
"              ->  Index Scan using idx_unique_words_filetype_word on unique_words f  (cost=0.00..19541.47 rows=92098 width=6)"
"                    Index Cond: (filetype = 'f'::text)"
"              ->  Index Scan using idx_unique_words_filetype_word on unique_words s  (cost=0.00..0.91 rows=13 width=6)"
"                    Index Cond: ((filetype = 's'::text) AND (word = f.word))"
"        ->  Index Scan using idx_unique_words_filetype_word on unique_words n  (cost=0.00..1.33 rows=44 width=6)"
"              Index Cond: ((filetype = 'n'::text) AND (word = f.word))"

The right answer should be 3808 different words (according to a Java
program I wrote).

This query takes more than 1 hour (after which I cancelled the query).
My questions are:
- Is this to be expected?
- Especially as the query over just 1 join takes 32 secs? (on f.word =
s.word omitting everything for n )
- Why does explain say it takes "7876150720 rows"? 
- Is there a way to rephrase the query that makes it faster?
- Could another table layout help (f,s,n are all possibilities for
filetype)?
- Anything else?????

TIA

-- 
Groeten,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
web: www.askesis.nl

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance