Re: planer picks a bad plan (seq-scan instead of index) when adding an additional join

"Thomas H." <me@xxxxxxxxxxxxx> · Thu, 9 Nov 2006 12:59:33 +0100

1. You MUST sequence scan dvds, as there is no way to do an index search 
on a like with % at the beginning.
2. You are asking for a left join on dvds, which means you want all 
records, so you must sequence scan dvds.  The filters are all OR, so you 
can't say that a records is excluded until AFTER the join is completed.

true, but thats fast (200ms).

3. The join estimates that 39900 rows will come out, but only 2 do, thats 
out by a large factor 10^4, which means that it's not going to join movies 
after the filter is applied.

20866 is the total number of rows in the dvd table. the planer is now 
showing the accurate rate after another (auto)vacuum run. of course it can't 
know the estimate of a '%...' comparsion, so estimating the full result set 
is ok.

now, an additional table (containing 600k records) is added through a 
left join. all the sudden the query takes 24sec. although there are 
indices on both dvds.dvd_ean and data_soundmedia.sm_info_ean, the planer 
does not make use of the indices but rather chooses to do 2 seq-scans.
The 2 items from the first query are still true.  You just have a lot more 
records to play with now by joining in a 3rd table.  Which means there is 
no way to reduce the possible output set before the join.

well, under normal cases, the output set would be determined by index 
lookups (see bottom)

->  Sort  (cost=286162.37..287781.38 rows=647601 width=660) (actual 
time=19336.011..20328.247 rows=646633 loops=1)
This sort here is where nearly all of the 24 seconds goes. I am assuming 
at this point that the Merge Join is chosen because of the stats problem 
listed next;

exactly. but that sort shouldn't happen as there is an index on the 
join-field, and that is usualy pretty fast (~400ms), but not here...

>  Merge Left Join  (cost=309190.05..313899.09 rows=159086 width=1722)
(actual time=19876.552..21902.007 rows=20866 loops=1)
This estimate is out by a factor of 10.  I'd increase the statistics on 
the join columns to see if it decides to use a different join method.

If the estimate for the Merge join is fixed to be closed, then it's likely 
an index scan would be chosen, 159000 is about 25% of the table.  Assuming 
it's small on disk then it's faster to do a seq_scan than all the random 
I/O to read the index, then the heap to produce results.

the discs are not the fastest, but there is plenty of free ram available. 
seq_page_cost is in its default state (1.0). should i raise this?

How big is data_soundmedia? Mb size, not rows.
How much is your sort_mem/work_mem?

data_soundmedia is 195mb + 105mb for indices (6 fields indexed).
work_mem = 30MB, sort_mem is undefined in 8.2's postgresql.conf

what troubles me is that its only slow with this table (data_soundmedia). we 
have other tables (400k and 200k entries) that are joined with the same 
query in under 400ms total. these tables do have the exact same structure 
and indices defined, but in these joins the planer properly uses an Index 
Scann...

if i force enable_seqscan = off, the planer makes use of the index, 
resulting in acceptable query speed:

Nested Loop Left Join  (cost=8402.16..257761.36 rows=83223 width=1067) 
(actual time=361.931..713.405 rows=2 loops=1)
 ->  Hash Left Join  (cost=8402.16..11292.37 rows=20873 width=407) (actual 
time=322.085..666.519 rows=2 loops=1)
       Hash Cond: (dvds.dvd_mov_id = movies.mov_id)
       Filter: ((lower((movies.mov_name)::text) ~~ '%superman re%'::text) 
OR (lower((dvds.dvd_name)::text) ~~ '%superman re%'::text) OR 
(lower((dvds.dvd_edition)::text) ~~ '%superman re%'::text))
       ->  Seq Scan on dvds  (cost=0.00..804.73 rows=20873 width=193) 
(actual time=11.781..329.672 rows=20866 loops=1)
       ->  Hash  (cost=8194.93..8194.93 rows=82893 width=214) (actual 
time=200.823..200.823 rows=37418 loops=1)
             ->  Seq Scan on movies  (cost=0.00..8194.93 rows=82893 
width=214) (actual time=0.070..155.178 rows=37418 loops=1)
 ->  Index Scan using data_soundmedia_info_ean_idx on data_soundmedia 
(cost=0.00..11.76 rows=4 width=660) (actual time=23.424..23.428 rows=1 
loops=2)
       Index Cond: ((data_soundmedia.sm_info_ean)::text = 
(dvds.dvd_ean)::text)
Total runtime: 716.988 ms

2 rows fetched (821 ms)

could it be the index gets somehow corrupted? but on the other hand, if i do 
a TRUNCATE before loading new data, it should be rebuild anyway, shouldn't 
it?

thanks,
thomas