I am trying to rewrite a query that currently takes 1.5 minutes to finish. I have it down to 20s and am still trying to pin it down.
Basically, the query currently looks something like this:
select a.*, b.*
from foo a
join bar b on a.id = b.a_id and a.banned <> true
where
  a.start <= now()
  and b."end" > now();
That's the 20s query, and now I have it down to 10s by using something that in my eyes should always be wrong, and against all logic. So could someone please explain to me why this is faster:
select a.*, b.*
from foo a
join bar b on a.id = b.a_id
where
  not exists (
    select id from foo where foo.id = b.a_id and foo.banned <> true
  )
  and a.start <= now()
  and b."end" > now();
The plans differ, obviously: the second one uses an index to look up .banned in foo, while the first one goes for a seq scan.
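For reference, the plans come from running both variants under EXPLAIN ANALYZE, along these lines (using the simplified table names from above):

```sql
-- first variant, with the banned check in the join condition
EXPLAIN ANALYZE
select a.*, b.*
from foo a
join bar b on a.id = b.a_id and a.banned <> true
where
  a.start <= now()
  and b."end" > now();
```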
The result set is the same, but I was actually expecting quite the opposite. So is a join on 1-2M rows a bad idea?
The effect can be seen on both 8.1 and CVS HEAD.
I would be grateful if someone could clarify this for me.
--
GJ