Please don't reply to previous messages to start new threads. This makes it harder to find stuff in the archives and may keep people from noticing your message. On Wed, May 17, 2006 at 08:54:52 -0700, "Craig A. James" <cjames@xxxxxxxxxxxxxxxx> wrote: > Here's a "corner case" that might interest someone. It tripped up one of > our programmers. > > We have a table with > 10 million rows. The ID column is indexed, the > table has been vacuum/analyzed. Compare these two queries: > > select * from tbl where id >= 10000000 limit 1; > select * from tbl where id >= 10000000 order by id limit 1; > > The first takes 4 seconds, and uses a full table scan. The second takes 32 > msec and uses the index. Details are below. I suspect it wasn't intended to be a full table scan. But rather a sequential scan until it found a matching row. If the data in the table is ordered by by id, this strategy may not work out well. Where as if the data is randomly ordered, it would be expected to find a match quickly. Have you analyzed the table recently? If the planner has bad stats on the table, that is going to make it more likely to choose a bad plan. > I understand why the planner makes the choices it does -- the "id > > 10000000" isn't very selective and under normal circumstances a full table > scan is probably the right choice. But the "limit 1" apparently doesn't > alter the planner's strategy at all. We were surprised by this. > > Adding the "order by" was a simple solution. > > Craig > > > > pg=> explain analyze select url, url_digest from url_queue where priority > >= 10000000 limit 1; > QUERY PLAN > ------------------------------------------------------------------------------------------ > Limit (cost=0.00..0.65 rows=1 width=108) (actual time=4036.113..4036.117 > rows=1 loops=1) > -> Seq Scan on url_queue (cost=0.00..391254.35 rows=606176 width=108) > (actual time=4036.101..4036.101 rows=1 loops=1) > Filter: (priority >= 10000000) > Total runtime: 4036.200 ms > (4 rows) > > pg=> explain analyze select url, url_digest from url_queue where priority > >= 10000000 order by priority limit 1; > QUERY PLAN > -------------------------------------------------------------------------------------- > Limit (cost=0.00..2.38 rows=1 width=112) (actual time=32.445..32.448 > rows=1 loops=1) > -> Index Scan using url_queue_priority on url_queue > (cost=0.00..1440200.41 rows=606176 width=112) (actual time=32.434..32.434 > rows=1 loops=1) > Index Cond: (priority >= 10000000) > Total runtime: 32.566 ms > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster