Hi everyone,
I'm having trouble getting rid of a sequential scan on a table with roughly 120k entries it. Creation of an index on that particular column which triggers the sequential scan doesn't do anything, VACUUM and ANALYZE has been done on the table.
The table in question has the following definition:
Column | Type | Modifiers
--------------------+--------------------------+------------------------------------------------------------------
post_id | bigint | not null default nextval('posts_post_id_seq'::regclass)
forum_id | bigint | not null
threadlink | character varying(255) | not null
timestamp | timestamp with time zone | not null
poster_id | bigint |
thread_id | bigint | not null
subject | text | not null
text | text | not null
postername | character varying(255) |
internal_post_id | bigint | not null default nextval('posts_internal_post_id_seq'::regclass)
internal_thread_id | bigint |
Indexes:
"posts_pkey" PRIMARY KEY, btree (internal_post_id)
"posts_forum_id_key" UNIQUE, btree (forum_id, post_id)
"idx_internal_thread_id" btree (internal_thread_id)
"idx_posts_poster_id" btree (poster_id)
Foreign-key constraints:
"posts_forum_id_fkey" FOREIGN KEY (forum_id) REFERENCES forums(forum_id)
"posts_internal_thread_id_fkey" FOREIGN KEY (internal_thread_id) REFERENCES threads(internal_thread_id)
"posts_poster_id_fkey" FOREIGN KEY (poster_id) REFERENCES posters(poster_id)
The query is this:
SELECT threads.internal_thread_id AS threads_internal_thread_id, threads.forum_id AS threads_forum_id, threads.thread_id AS threads_thread_id, threads.title AS threads_title, threads.poster_id AS threads_poster_id, threads.postername AS threads_postername, threads.category AS threads_category, threads.posttype AS threads_posttype FROM threads JOIN
posts ON threads.internal_thread_id = posts.internal_thread_id JOIN posters ON posts.poster_id = posters.poster_id JOIN posters_groups AS posters_groups_1 ON posters.poster_id = posters_groups_1.poster_id JOIN groups ON groups.group_id = posters_groups_1.group_id WHERE groups.group_id = 4 ORDER BY posts.timestamp DESC;
The query plan (with an explain analyze) gives me the following:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=13995.93..14006.63 rows=4279 width=108) (actual time=79.927..79.947 rows=165 loops=1)
Sort Key: posts."timestamp"
Sort Method: quicksort Memory: 50kB
-> Nested Loop (cost=6.97..13737.84 rows=4279 width=108) (actual time=0.605..79.693 rows=165 loops=1)
-> Seq Scan on groups (cost=0.00..1.05 rows=1 width=8) (actual time=0.013..0.014 rows=1 loops=1)
Filter: (group_id = 4)
-> Nested Loop (cost=6.97..13694.00 rows=4279 width=116) (actual time=0.587..79.616 rows=165 loops=1)
-> Hash Join (cost=6.97..12343.10 rows=4279 width=24) (actual time=0.568..78.230 rows=165 loops=1)
Hash Cond: (posts.poster_id = posters.poster_id)
-> Seq Scan on posts (cost=0.00..11862.12 rows=112312 width=24) (actual time=0.019..60.092 rows=112312 loops=1)
-> Hash (cost=6.79..6.79 rows=14 width=24) (actual time=0.101..0.101 rows=14 loops=1)
-> Hash Join (cost=2.14..6.79 rows=14 width=24) (actual time=0.060..0.093 rows=14 loops=1)
Hash Cond: (posters.poster_id = posters_groups_1.poster_id)
-> Seq Scan on posters (cost=0.00..3.83 rows=183 width=8) (actual time=0.006..0.023 rows=185 loops=1)
-> Hash (cost=1.96..1.96 rows=14 width=16) (actual time=0.025..0.025 rows=14 loops=1)
-> Seq Scan on posters_groups posters_groups_1 (cost=0.00..1.96 rows=14 width=16) (actual time=0.016..0.021 rows=14 loops=1)
Filter: (group_id = 4)
-> Index Scan using threads_pkey on threads (cost=0.00..0.30 rows=1 width=100) (actual time=0.006..0.007 rows=1 loops=165)
Index Cond: (threads.internal_thread_id = posts.internal_thread_id)
Total runtime: 80.137 ms
(20 rows)
So the big time lost is in this line:
Seq Scan on posts (cost=0.00..11862.12 rows=112312 width=24) (actual time=0.019..60.092 rows=112312 loops=1)
which I can understand why it slow ;)
But I haven't yet managed to convert the Seq Scan into an Index Scan, and I'm not sure how to continue there.
As I am not a big expert on psql optimization, any input would be greatly appreciated.
Best regards,
Jens