Here is the results of 'vacuum analyse verbose' on my table: INFO: vacuuming "public.jbpm_taskinstance" INFO: scanned index "jbpm_taskinstance_pkey" to remove 928153 row versions DETAIL: CPU 0.70s/2.40u sec elapsed 46.49 sec. INFO: scanned index "idx_tskinst_tminst" to remove 928153 row versions DETAIL: CPU 0.78s/2.34u sec elapsed 88.99 sec. INFO: scanned index "idx_tskinst_slinst" to remove 928153 row versions DETAIL: CPU 0.63s/2.37u sec elapsed 92.54 sec. INFO: scanned index "idx_taskinst_tokn" to remove 928153 row versions DETAIL: CPU 0.99s/2.30u sec elapsed 110.29 sec. INFO: scanned index "idx_taskinst_tsk" to remove 928153 row versions DETAIL: CPU 0.92s/2.63u sec elapsed 89.16 sec. INFO: scanned index "idx_pooled_actor" to remove 928153 row versions DETAIL: CPU 0.32s/1.65u sec elapsed 2.56 sec. INFO: scanned index "idx_task_actorid" to remove 928153 row versions DETAIL: CPU 0.09s/1.88u sec elapsed 2.69 sec. INFO: "jbpm_taskinstance": removed 928153 row versions in 13685 pages DETAIL: CPU 0.84s/0.82u sec elapsed 26.42 sec. INFO: index "jbpm_taskinstance_pkey" now contains 7555748 row versions in 62090 pages DETAIL: 927985 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.03 sec. INFO: index "idx_tskinst_tminst" now contains 7555748 row versions in 65767 pages Afterwards I ran a 'vacuum full verbose' INFO: vacuuming "public.jbpm_taskinstance" INFO: "jbpm_taskinstance": found 0 removable, 7555748 nonremovable row versions in 166156 pages DETAIL: 0 dead row versions cannot be removed yet. Nonremovable row versions range from 88 to 209 bytes long. There were 8470471 unused item pointers. Total free space (including removable row versions) is 208149116 bytes. 9445 pages are or will become empty, including 0 at the end of the table. 119104 pages containing 206008504 free bytes are potential move destinations. CPU 2.44s/1.60u sec elapsed 127.89 sec. INFO: index "jbpm_taskinstance_pkey" now contains 7555748 row versions in 62090 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.87s/2.16u sec elapsed 120.81 sec. INFO: index "idx_tskinst_tminst" now contains 7555748 row versions in 65767 pages DETAIL: 0 index row versions were removed. 26024 index pages have been deleted, 26024 are currently reusable. CPU 0.79s/1.95u sec elapsed 103.52 sec. INFO: index "idx_tskinst_slinst" now contains 7555748 row versions in 56031 pages DETAIL: 0 index row versions were removed. 28343 index pages have been deleted, 28343 are currently reusable. CPU 0.62s/1.93u sec elapsed 99.21 sec. INFO: index "idx_taskinst_tokn" now contains 7555748 row versions in 65758 pages DETAIL: 0 index row versions were removed. 26012 index pages have been deleted, 26012 are currently reusable. CPU 1.10s/2.18u sec elapsed 108.29 sec. INFO: index "idx_taskinst_tsk" now contains 7555748 row versions in 64516 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 1.01s/1.73u sec elapsed 64.73 sec. INFO: index "idx_pooled_actor" now contains 7555748 row versions in 20896 pages DETAIL: 0 index row versions were removed. 136 index pages have been deleted, 136 are currently reusable. CPU 0.26s/1.57u sec elapsed 3.01 sec. INFO: index "idx_task_actorid" now contains 7555748 row versions in 20885 pages DETAIL: 0 index row versions were removed. 121 index pages have been deleted, 121 are currently reusable. CPU 0.23s/1.52u sec elapsed 2.77 sec. INFO: "jbpm_taskinstance": moved 1374243 row versions, truncated 166156 to 140279 pages DETAIL: CPU 26.50s/138.35u sec elapsed 735.02 sec. INFO: index "jbpm_taskinstance_pkey" now contains 7555748 row versions in 62090 pages DETAIL: 1374243 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 1.04s/1.38u sec elapsed 117.72 sec. INFO: index "idx_tskinst_tminst" now contains 7555748 row versions in 65767 pages DETAIL: 1374243 index row versions were removed. 26024 index pages have been deleted, 26024 are currently reusable. CPU 1.37s/1.01u sec elapsed 123.56 sec. INFO: index "idx_tskinst_slinst" now contains 7555748 row versions in 56031 pages DETAIL: 1374243 index row versions were removed. 28560 index pages have been deleted, 28560 are currently reusable. CPU 1.20s/1.27u sec elapsed 105.67 sec. INFO: index "idx_taskinst_tokn" now contains 7555748 row versions in 65758 pages DETAIL: 1374243 index row versions were removed. 26012 index pages have been deleted, 26012 are currently reusable. CPU 1.29s/0.96u sec elapsed 112.62 sec. INFO: index "idx_taskinst_tsk" now contains 7555748 row versions in 64516 pages DETAIL: 1374243 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 1.48s/1.12u sec elapsed 70.56 sec. INFO: index "idx_pooled_actor" now contains 7555748 row versions in 25534 pages DETAIL: 1374243 index row versions were removed. 3769 index pages have been deleted, 3769 are currently reusable. CPU 0.48s/0.82u sec elapsed 6.89 sec. INFO: index "idx_task_actorid" now contains 7555748 row versions in 25545 pages DETAIL: 1374243 index row versions were removed. 3790 index pages have been deleted, 3790 are currently reusable. CPU 0.37s/1.24u sec elapsed 7.93 sec. INFO: vacuuming "pg_toast.pg_toast_560501" INFO: "pg_toast_560501": found 0 removable, 0 nonremovable row versions in 0 pages DETAIL: 0 dead row versions cannot be removed yet. Nonremovable row versions range from 0 to 0 bytes long. There were 0 unused item pointers. Total free space (including removable row versions) is 0 bytes. 0 pages are or will become empty, including 0 at the end of the table. 0 pages containing 0 free bytes are potential move destinations. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "pg_toast_560501_index" now contains 0 row versions in 1 pages DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. Query returned successfully with no result in 1910948 ms. So it looks to me like it was bloated by about 25%. Performance seems better for 'some' values of column B (I believe all the vacuumed rows were at the beginning), but I've still done the same query which can take over a minute in some cases - depending on caching etc. It is still choosing sequential scan. Here are the queries you requested: EXPLAIN ANALYZE SELECT count(*) FROM jbpm_taskinstance WHERE actorid_ IS NULL; "Aggregate (cost=234297.69..234297.70 rows=1 width=0) (actual time=109648.328..109648.330 rows=1 loops=1)" " -> Seq Scan on jbpm_taskinstance (cost=0.00..215836.48 rows=7384484 width=0) (actual time=13.684..99002.279 rows=7315726 loops=1)" " Filter: (actorid_ IS NULL)" "Total runtime: 109648.403 ms" EXPLAIN ANALYZE SELECT count(*) FROM jbpm_taskinstance WHERE pooledactor_='21'; "Aggregate (cost=180929.77..180929.78 rows=1 width=0) (actual time=6739.215..6739.217 rows=1 loops=1)" " -> Bitmap Heap Scan on jbpm_taskinstance (cost=23839.23..178127.84 rows=1120769 width=0) (actual time=633.808..5194.672 rows=1020084 loops=1)" " Recheck Cond: ((pooledactor_)::text = '21'::text)" " -> Bitmap Index Scan on idx_pooled_actor (cost=0.00..23559.04 rows=1120769 width=0) (actual time=612.546..612.546 rows=1020084 loops=1)" " Index Cond: ((pooledactor_)::text = '21'::text)" "Total runtime: 6739.354 ms" EXPLAIN ANALYZE SELECT count(*) FROM jbpm_taskinstance WHERE actorid_ IS NULL AND pooledactor_='21'; "Aggregate (cost=180859.91..180859.92 rows=1 width=0) (actual time=4358.316..4358.318 rows=1 loops=1)" " -> Bitmap Heap Scan on jbpm_taskinstance (cost=23832.88..178121.49 rows=1095365 width=0) (actual time=377.206..2929.735 rows=1020062 loops=1)" " Recheck Cond: ((pooledactor_)::text = '21'::text)" " Filter: (actorid_ IS NULL)" " -> Bitmap Index Scan on idx_pooled_actor (cost=0.00..23559.04 rows=1120769 width=0) (actual time=373.160..373.160 rows=1020084 loops=1)" " Index Cond: ((pooledactor_)::text = '21'::text)" "Total runtime: 4366.766 ms" Many thanks, David -----Original Message----- From: Greg Stark [mailto:greg.stark enterprisedb.com] On Behalf Of Gregory Stark Sent: 02 September 2008 10:45 To: Guillaume Cottenceau Cc: David West; pgsql-performance@xxxxxxxxxxxxxx Subject: Re: limit clause breaks query planner? >>>> "Limit (cost=0.00..3.68 rows=15 width=128) (actual time=85837.043..85896.140 rows=15 loops=1)" >>>> " -> Seq Scan on my_table this_ (cost=0.00..258789.88 rows=1055580 width=128) (actual time=85837.038..85896.091 rows=15 loops=1)" >>>> " Filter: ((A IS NULL) AND ((B)::text = '21'::text))" >>>> "Total runtime: 85896.214 ms" Postgres does collect and use statistics about what fraction of the "A" column is null. It also collects and uses statistics about what fraction of the "B" column is 21 (using a histogram). And it does take the LIMIT into account. I think the other poster might well be right about this table being extremely bloated. You could test by running and posting the results of: VACUUM VERBOSE my_table What it doesn't collect is where in the table those records are -- so if there are a lot of them then it might use a sequential scan regardless of whether they're at the beginning or end of the table. That seems unlikely to be the problem though. The other thing it doesn't collect is how many of the B=21 records have null As. So if a large percentage of the table has A as null then it will assume that's true for the B=21 records and if there are a lot of B=21 records then it will assume a sequential scan will find matches quickly. If in fact the two columns are highly correlated and B=21 records almost never have A null whereas records with other values of B have lots of null values then Postgres might make a bad decision here. Also, it only has the statitics for B=21 via a histogram. If the distribution of B is highly skewed so that, for example values between 20 and 25 are very common but B=21 happens to be quite rare then Postgres might get a bad estimate here. You could improve this by raising the statistics target for the B column and re-analyzing. That brings up another question -- when was the last time this table was analyzed? What estimates and actual results does postgres get for simple queries like: EXPLAIN ANALYZE SELECT count(*) FROM my_table WHERE A IS NULL; EXPLAIN ANALYZE SELECT count(*) FROM my_table WHERE B=21; EXPLAIN ANALYZE SELECT count(*) FROM my_table WHERE A IS NULL AND B=21; -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!