Sergei Shelukhin wrote:
This is just an example isolating the problem. The actual queries
contain more tables and more joins and return a reasonable amount of
data.
The performance of big indexes, however, is appalling, with the planner
always reverting to a seqscan under default settings.
I tried to pre-filter the data as much as possible in the preceding
joins to put less strain on the offending join (fewer rows to join on?)
but it doesn't help.
I wonder what exactly makes the index perform 100+ times slower than a
seqscan. Even if the scan is performed on the HD, which it shouldn't be
given the index size, the index and the table are on the same HD, and
the index is smaller and also sorted, isn't it?
set enable_seqscan = on;
explain select * from entries inner join stuff on entries.id = stuff.id;
  ->  Seq Scan on stuff  (cost=0.00..193344.00 rows=12550400 width=12)

set enable_seqscan = off;
explain select * from entries inner join stuff on entries.id = stuff.id;
  ->  Index Scan using blah on stuff  (cost=0.00..25406232.30 rows=12550400 width=12)
I don't think you really understand PostgreSQL very well.
There's no evidence that the index scan is 100 times slower. The
planner is guesstimating that it will take much longer for the index
scan to do the same work.
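
For what it's worth, that guesstimate comes out of a handful of planner
cost settings. A quick way to peek at the two main knobs (defaults shown
in the comments; assuming your version is new enough to have both):

show seq_page_cost;     -- 1.0 by default: cost of one sequential page read
show random_page_cost;  -- 4.0 by default: the planner assumes the random
                        -- I/O of an index scan is 4x pricier per page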
Do some research on PostgreSQL's MVCC model and "visibility". The
indexes don't carry visibility information, so every index access has to
hit the table anyway; if the query planner figures you're going to hit
10% or so of the table, it will just seq scan it and go.
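
To make that concrete, here's a sketch using the tables from the post
above (the id value is made up): a selective lookup should pick the
index, while the unfiltered join touches every row, so going through the
index would only add random heap I/O on top of reading the whole table.

explain select * from stuff where id = 12345;  -- selective: index pays off
explain select * from entries inner join stuff on entries.id = stuff.id;
                                               -- hits every row: seq scan wins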
Run your queries with "explain analyze" and see which is faster. If the
seq scan is faster, then pgsql made the right choice. What you can do
to speed it up depends on your system.
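
Something like this, run against your real data, will show the actual
times for both plans (enable_seqscan is only a debugging knob here;
don't leave it off in production):

set enable_seqscan = off;
explain analyze select * from entries inner join stuff on entries.id = stuff.id;
set enable_seqscan = on;
explain analyze select * from entries inner join stuff on entries.id = stuff.id;
-- compare the "actual time" on the top plan node, not the cost estimates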
Post the output of explain analyze select ... here and let us look at it.
More importantly, post your REAL queries with their explain analyze
output (not just explain) and let's see what we see.