Thanks for the great info, Chad. I'm learning a lot from this thread!

> 347434 rows * 156 bytes = 52MB (reasonable it could be held in your
> shared buffers, which makes Tom's suggestion very plausible - the
> index scan may not be cheaper, because it is all cached)

Maybe - I tried running the same query for an older time range that is less likely to be cached. The index scan took longer than my previous example, but still only took 16 seconds, compared to the 87 seconds required to seqscan the table. When I can, I'll restart the machine and run the comparison again to get a "pure" test (roughly along the lines of the first sketch in the P.S. below).

> One of the reasons why the sequential scan is slower is because the
> optimizer doesn't know the data you are requesting is sitting in the
> cache (and it is VERY unlikely you have the entire table in cache,
> unless it is a heavily used table or very small table, which it's
> probably not).

This is a large table (3 million rows). Rows are typically inserted in date order, although large numbers of rows are deleted every night. Basically, this table contains a list of transactions in a rolling time window: inserts happen throughout the day, and then a day's worth of old rows is deleted every night. The most accessed rows are going to be today's rows, which are a small subset of the overall table (maybe 14%).

> One thing to be careful of here is that you really need to consider
> what the primary use of the table is, and what major queries you
> will be launching against it. You could improve the correlation by
> rebuilding the table ordered by the transaction_date column, but it
> may screw up other range scans.

Date is almost always a criterion in scans of this table. As mentioned earlier, the table is naturally built in date order - so would rebuilding the table help? Is it possible that even though I'm inserting in date order, the physical correlation gets disrupted as disk pages are reused, since I delete rows so often? Perhaps clustering on the transaction_date index and periodically running CLUSTER would help (see the sketch in the P.S.)? Does VACUUM FULL help with this at all?

> Another option is partitioning. I wouldn't do any of this stuff
> until you find out whether the last tweak you made still holds
> true - give it a few days, perhaps test it after a clean reboot of
> the server.

Yeah - partitioning makes a lot of sense, and I've thought about doing it in the past. Although I do run queries that cross multiple days, most of my queries only look at today's data, so the physical disk organization would likely be much better with a partitioned table setup. Also, since I usually delete old data one day at a time, I could simply drop the old day's partition (sketched in the P.S. as well), which would make vacuuming much less of an issue. But I won't be making any changes immediately, so I'll continue to run tests following your advice.

Thanks again,
Jeremy Haile
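
P.S. Here's roughly the "pure" comparison I have in mind, in case anyone wants to sanity-check it. The table and column names below are placeholders for my real schema, and the enable_* settings are just the standard planner toggles for forcing one plan or the other:

  -- Force the index scan plan and time it:
  SET enable_seqscan = off;
  EXPLAIN ANALYZE
  SELECT * FROM transactions
  WHERE transaction_date >= '2007-01-01'
    AND transaction_date < '2007-01-08';

  -- Then force the sequential scan plan on the same query:
  SET enable_seqscan = on;
  SET enable_indexscan = off;
  SET enable_bitmapscan = off;
  EXPLAIN ANALYZE
  SELECT * FROM transactions
  WHERE transaction_date >= '2007-01-01'
    AND transaction_date < '2007-01-08';

Of course, the second run still benefits from whatever the first run pulled into cache, which is why I want to repeat this after a clean restart.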
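
And the clustering idea, as a minimal sketch - the index name transactions_date_idx is made up:

  -- Rewrite the table in index order (takes an exclusive lock while
  -- it runs):
  CLUSTER transactions_date_idx ON transactions;

  -- CLUSTER is a one-shot rewrite, so the ordering drifts again as
  -- rows are inserted and deleted pages are reused; re-running it
  -- with no index named reuses the remembered clustering index:
  CLUSTER transactions;

  -- Refresh the correlation statistics the planner sees:
  ANALYZE transactions;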
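
Finally, the partitioning scheme I've been mulling over, sketched with one child table per day using constraint-exclusion inheritance. The names and dates are made up, and I've left out the piece that routes inserts to the right child:

  -- The parent stays empty; each child holds one day's transactions
  -- and carries a CHECK constraint describing its date range:
  CREATE TABLE transactions_2007_01_15 (
      CHECK (transaction_date >= DATE '2007-01-15'
         AND transaction_date <  DATE '2007-01-16')
  ) INHERITS (transactions);

  -- Let the planner skip children whose CHECK constraint rules them
  -- out of a date-qualified query:
  SET constraint_exclusion = on;

  -- The nightly cleanup becomes a metadata operation instead of a
  -- big DELETE followed by a vacuum:
  DROP TABLE transactions_2007_01_08;

Since dropping a partition never creates dead tuples, the nightly delete would stop generating vacuum work entirely.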