Re: Postgres 8.3 only uses seq scan

Clemens Schwaighofer <clemens.schwaighofer@xxxxxxxxxx> · Wed, 26 Nov 2008 14:22:29 +0900

On 11/26/2008 02:15 PM, Scott Marlowe wrote:
> On Tue, Nov 25, 2008 at 10:07 PM, Clemens Schwaighofer
> <clemens.schwaighofer@xxxxxxxxxx> wrote:
>> On 11/26/2008 02:04 PM, Scott Marlowe wrote:
>>> On Tue, Nov 25, 2008 at 8:39 PM, Clemens Schwaighofer
>>> <clemens.schwaighofer@xxxxxxxxxx> wrote:
>>>> but on the 8.3 version i get this back
>>>>
>>>> # explain select * from foo f, bar b where f.foo_id = b.foo_id;
>>>>                            QUERY PLAN
>>>> ------------------------------------------------------------------
>>>>  Hash Join  (cost=1.07..2.14 rows=3 width=24)
>>>>   Hash Cond: (b.foo_id = f.foo_id)
>>>>   ->  Seq Scan on bar b  (cost=0.00..1.03 rows=3 width=14)
>>>>   ->  Hash  (cost=1.03..1.03 rows=3 width=10)
>>>>         ->  Seq Scan on foo f  (cost=0.00..1.03 rows=3 width=10)
>>> Of course it uses a seq scan.  All the data fits handily into a single
>>> page I assume.
>> okay, the strange thing is, that in 8.2 it always used an index scan.
> 
> Are there more rows in the 8.2 table you're testing on?  Or is the
> whole table small enough to fit on a few pages?

I highly doubt that. I have right now in one of the DBs I transfered
tables from ~100.000 down to ~40.000 rows that all join together. I
somehow really doubt that fit in a few pages.

That is why I was so surprised to see such a big difference in the explain.

> 
>>>> once I insert a million rows he does use the index:
>>>>
>>>> # explain select * from foo f, bar b where f.foo_id = b.foo_id;
>>>>                                    QUERY PLAN
>>>> -----------------------------------------------------------------------------------
>>>>  Nested Loop  (cost=0.00..26.39 rows=9 width=35)
>>>>   ->  Seq Scan on foo f  (cost=0.00..1.03 rows=3 width=21)
>>>>   ->  Index Scan using bar_foo_id_idx on bar b  (cost=0.00..8.42 rows=3
>>>> width=14)
>>>>         Index Cond: (b.foo_id = f.foo_id)
>>> I don't see a million rows here, only three.  Have you run analyze
>>> after loading all that data?  Or is it retrieving 3 rows out of a
>>> million?  If so then an index scan does make sense.
>> yeah, there are 3 matching rows, and the rest is just data to make the
>> table big.
>>
>> I am just still confused, because if Postgres does only use seq scan
>> even in very large databases, I am worried I do something very wrong in
>> my DB design ...
> 
> Postgresql has no visibility in its indexes, meaning that whether it
> uses an index or not, it still has to go to the table to see if the
> tuple is actually visible to this transaction.  For this reason,
> PostgreSQL switches to sequential scans quicker than other dbs that
> have visibility information in their indexes.
> 
> The planner is pretty smart, but if you're going to hit a large % of
> the table anyway, it switches to sequential scans since it will have
> to retreive the majority of the table anyway.

So, I am fine when I trust the Postgresql planner :) Because speed wise
I see no difference that 8.3 would be slower than 8.2

-- 
[ Clemens Schwaighofer                      -----=====:::::~ ]
[ IT Engineer/Manager                                        ]
[ E-Graphics Communications, TEQUILA\ Japan IT Group         ]
[                6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN ]
[ Tel: +81-(0)3-3545-7706            Fax: +81-(0)3-3545-7343 ]
[ http://www.tequila.jp                                      ]

Attachment:
signature.asc

Description: OpenPGP digital signature