On 11/26/2008 02:15 PM, Scott Marlowe wrote: > On Tue, Nov 25, 2008 at 10:07 PM, Clemens Schwaighofer > <clemens.schwaighofer@xxxxxxxxxx> wrote: >> On 11/26/2008 02:04 PM, Scott Marlowe wrote: >>> On Tue, Nov 25, 2008 at 8:39 PM, Clemens Schwaighofer >>> <clemens.schwaighofer@xxxxxxxxxx> wrote: >>>> but on the 8.3 version i get this back >>>> >>>> # explain select * from foo f, bar b where f.foo_id = b.foo_id; >>>> QUERY PLAN >>>> ------------------------------------------------------------------ >>>> Hash Join (cost=1.07..2.14 rows=3 width=24) >>>> Hash Cond: (b.foo_id = f.foo_id) >>>> -> Seq Scan on bar b (cost=0.00..1.03 rows=3 width=14) >>>> -> Hash (cost=1.03..1.03 rows=3 width=10) >>>> -> Seq Scan on foo f (cost=0.00..1.03 rows=3 width=10) >>> Of course it uses a seq scan. All the data fits handily into a single >>> page I assume. >> okay, the strange thing is, that in 8.2 it always used an index scan. > > Are there more rows in the 8.2 table you're testing on? Or is the > whole table small enough to fit on a few pages? I highly doubt that. I have right now in one of the DBs I transfered tables from ~100.000 down to ~40.000 rows that all join together. I somehow really doubt that fit in a few pages. That is why I was so surprised to see such a big difference in the explain. > >>>> once I insert a million rows he does use the index: >>>> >>>> # explain select * from foo f, bar b where f.foo_id = b.foo_id; >>>> QUERY PLAN >>>> ----------------------------------------------------------------------------------- >>>> Nested Loop (cost=0.00..26.39 rows=9 width=35) >>>> -> Seq Scan on foo f (cost=0.00..1.03 rows=3 width=21) >>>> -> Index Scan using bar_foo_id_idx on bar b (cost=0.00..8.42 rows=3 >>>> width=14) >>>> Index Cond: (b.foo_id = f.foo_id) >>> I don't see a million rows here, only three. Have you run analyze >>> after loading all that data? Or is it retrieving 3 rows out of a >>> million? If so then an index scan does make sense. >> yeah, there are 3 matching rows, and the rest is just data to make the >> table big. >> >> I am just still confused, because if Postgres does only use seq scan >> even in very large databases, I am worried I do something very wrong in >> my DB design ... > > Postgresql has no visibility in its indexes, meaning that whether it > uses an index or not, it still has to go to the table to see if the > tuple is actually visible to this transaction. For this reason, > PostgreSQL switches to sequential scans quicker than other dbs that > have visibility information in their indexes. > > The planner is pretty smart, but if you're going to hit a large % of > the table anyway, it switches to sequential scans since it will have > to retreive the majority of the table anyway. So, I am fine when I trust the Postgresql planner :) Because speed wise I see no difference that 8.3 would be slower than 8.2 -- [ Clemens Schwaighofer -----=====:::::~ ] [ IT Engineer/Manager ] [ E-Graphics Communications, TEQUILA\ Japan IT Group ] [ 6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN ] [ Tel: +81-(0)3-3545-7706 Fax: +81-(0)3-3545-7343 ] [ http://www.tequila.jp ]
Attachment:
signature.asc
Description: OpenPGP digital signature