Running PostgreSQL 8.4.7 (backport package from Debian Lenny).
I have some queries that are based on views, and an engine adds a few clauses (like NULLS LAST). One of these queries has a performance problem.
The simplified form is this:
shs=# explain analyze select * from performance e JOIN part v ON v.performance_id = e.id order by e.creation_date desc limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..4.25 rows=10 width=312) (actual time=0.078..0.147 rows=10 loops=1)
-> Nested Loop (cost=0.00..62180.28 rows=146294 width=312) (actual time=0.078..0.145 rows=10 loops=1)
-> Index Scan Backward using performance_create_idx on performance e (cost=0.00..12049.21 rows=145379 width=247) (actual time=0.051..0.087 rows=10 loops=1)
-> Index Scan using part_performance_idx on part v (cost=0.00..0.33 rows=1 width=65) (actual time=0.005..0.005 rows=1 loops=10)
Index Cond: (v.performance_id = e.id)
Total runtime: 0.205 ms
creation_date is declared as NOT NULL, and since it's a inner join, creation_date can never be null in this query. I'd think that if I add NULLS LAST, it wouldn't have any effect.
However, the query with NULLS LAST (as generated by the engine):
shs=# explain analyze select * from performance e JOIN part v ON v.performance_id = e.id order by e.creation_date desc nulls last limit 10;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=25773.76..25773.79 rows=10 width=312) (actual time=492.959..492.963 rows=10 loops=1)
-> Sort (cost=25773.76..26139.50 rows=146294 width=312) (actual time=492.958..492.962 rows=10 loops=1)
Sort Key: e.creation_date
Sort Method: top-N heapsort Memory: 27kB
-> Merge Join (cost=1.27..22612.40 rows=146294 width=312) (actual time=0.064..367.160 rows=146294 loops=1)
Merge Cond: (e.id = v.performance_id)
-> Index Scan using performance_pkey on performance e (cost=0.00..11989.20 rows=145379 width=247) (actual time=0.035..160.838 rows=145379 loops=1)
-> Index Scan using part_performance_idx on part v (cost=0.00..8432.35 rows=146294 width=65) (actual time=0.025..91.084 rows=146294 loops=1)
Total runtime: 493.062 ms
Both tables have around 150k rows as you can read from the last plan.
Table performance:
Table "public.performance"
Column | Type | Modifiers
-----------------+--------------------------+----------------------------------------------------------
created_by | integer | not null
creation_date | timestamp with time zone | not null
comments | text |
owned_by | integer | not null
id | integer | not null default nextval('performance_id_seq'::regclass)
title | text |
title_ | text |
performer_id | integer |
first_medium_id | integer |
vperf_id | integer |
perf_date | partial_date |
bonustrack | boolean | not null default false
type_id | integer | not null
instrumental | boolean | not null default false
init_rev_level | smallint | not null default 1
curr_rev_level | smallint | not null default 1
revision_date | timestamp with time zone |
revised_by | integer |
object_type | text | not null default 'performance'::text
editor_note | text |
active | boolean | not null default true
Indexes:
"performance_pkey" PRIMARY KEY, btree (id)
"performance_create_idx" btree (creation_date)
"performance_medium_idx" btree (first_medium_id)
"performance_own_idx" btree (owned_by)
"performance_performer_idx" btree (performer_id)
Table part:
Table "public.part"
Column | Type | Modifiers
----------------+--------------------------+---------------------------------------------------
created_by | integer | not null
creation_date | timestamp with time zone |
comments | text |
owned_by | integer | not null
id | integer | not null default nextval('part_id_seq'::regclass)
work_id | integer | not null
performance_id | integer | not null
Indexes:
"part_pkey" PRIMARY KEY, btree (id)
"part_own_idx" btree (owned_by)
"part_performance_idx" btree (performance_id)
"part_work_idx" btree (work_id)
Please advise!
Thanks.
Kind regards,
Mathieu