On Wed, 2007-08-01 at 15:56 -0500, Mason Hale wrote:
> On a 8.1.9 version database that has been recently vacuumed and
> analyzed, I'm seeing some dramatic performance degradation if a limit
> clause is included in the query. This seems counter-intuitive to me.
>
> Here's the query and explain plan WITH the LIMIT clause:
>
> SELECT *
> FROM topic_feed
> WHERE topic_id = 106947234
> ORDER BY score DESC
> LIMIT 25
>
> Limit  (cost=0.00..651.69 rows=25 width=29) (actual time=72644.652..72655.029 rows=25 loops=1)
>   ->  Index Scan Backward using topic_feed_score_index on topic_feed  (cost=0.00..21219.08 rows=814 width=29) (actual time=72644.644..72654.855 rows=25 loops=1)
>         Filter: (topic_id = 106947234)
> Total runtime: 72655.733 ms
>
> ==============
>
> and now WITHOUT the LIMIT clause:
>
> SELECT *
> FROM topic_feed
> WHERE topic_id = 106947234
> ORDER BY score DESC
>
> Sort  (cost=1683.75..1685.78 rows=814 width=29) (actual time=900.553..902.267 rows=492 loops=1)
>   Sort Key: score
>   ->  Bitmap Heap Scan on topic_feed  (cost=7.85..1644.40 rows=814 width=29) (actual time=307.900..897.993 rows=492 loops=1)
>         Recheck Cond: (topic_id = 106947234)
>         ->  Bitmap Index Scan on index_topic_feed_on_topic_id_and_feed_id  (cost=0.00..7.85 rows=814 width=0) (actual time=213.205..213.205 rows=2460 loops=1)
>               Index Cond: (topic_id = 106947234)
> Total runtime: 904.049 ms

Let's call those plan 1 and plan 2.

In plan 1, the planner thinks it will find 25 tuples matching that
topic_id quickly during the backward index scan on
topic_feed_score_index. Instead, it has to read through a great many
tuples before it finds the 25 it needs.

In plan 2, the work happens in the opposite order: first it finds the
tuples matching the topic_id, then it sorts them and returns the
first 25.

If I'm reading the plans correctly (plan 2, in particular), there are
2460 tuples matching the topic_id, but only 492 of them are visible. I
don't know why so many invisible tuples still exist in the index after
your VACUUM. Are there a lot of UPDATEs/DELETEs on tuples with that
topic_id between when you run the VACUUM and when you run the SELECT?
Do you have big transactions open while the VACUUM runs? Also, how
many tuples are in the table overall?

The stats look fairly accurate: the estimate of matching rows is only
off by about a factor of two (which isn't terribly bad), 814 estimated
versus 492 that are actually visible and match the topic_id.

If I had to guess, I'd say you either:

(a) have too many invisible tuples; you can correct this with more
    frequent VACUUMing, or

(b) have an uneven distribution of tuples with that topic_id, where
    many of the matching tuples happen to have a low score; this is
    harder to fix, because the stats collector can't detect it. You'd
    probably have to coerce the planner into using plan 2.

Regards,
	Jeff Davis
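
As a rough check on (a), and not something discussed above, running a
verbose vacuum on the table from the original query will report how
many dead row versions it finds:

-- Reports removable and nonremovable dead row versions; a large
-- "nonremovable" count usually points at a long-running transaction
-- that is holding back cleanup.
VACUUM VERBOSE topic_feed;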
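
And for (b), one common way to nudge the planner toward plan 2 is to
wrap the sort column in an expression so the index on score can no
longer satisfy the ORDER BY. This is just a sketch based on the query
above, untested against your schema:

-- "score + 0" sorts identically to "score", but the planner can no
-- longer walk topic_feed_score_index backward to produce that order,
-- so it scans by topic_id (as in plan 2) and sorts the few hundred
-- matching rows before applying the LIMIT.
SELECT *
FROM topic_feed
WHERE topic_id = 106947234
ORDER BY score + 0 DESC
LIMIT 25;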