from pg_settings:
name | current_setting
version | PostgreSQL 9.0.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
bytea_output | escape
checkpoint_completion_target | 0.9
checkpoint_segments | 24
effective_cache_size | 24GB
effective_io_concurrency | 2
lc_collate | en_US.utf8
lc_ctype | en_US.utf8
listen_addresses | *
log_checkpoints | on
log_connections | on
log_disconnections | on
log_hostname | on
log_line_prefix | %t
logging_collector | on
maintenance_work_mem | 256MB
max_connections | 120
max_stack_depth | 2MB
port | 5432
server_encoding | UTF8
shared_buffers | 4GB
synchronous_commit | off
tcp_keepalives_idle | 180
TimeZone | US/Eastern
track_activity_query_size | 8192
wal_buffers | 16MB
wal_writer_delay | 330ms
work_mem | 512MB
This is a dual dual-core 64bit intel machine (hyperthreaded so 8 logical cpus) with 24GB of memory running basically just the db against a raid 5 diskarray.
Tim
On Tue, Mar 15, 2011 at 2:23 PM, Timothy Garnett <tgarnett@xxxxxxxxxxx> wrote:
Hi all,
We added an index to a table (to support some different functionality) then ran into cases where the new index (on month, bl_number in the schema below) made performance of some existing queries ~20,000 times worse. While we do have a workaround (using a CTE to force the proper index to be used) that gets us down to ~twice the original performance (i.e. without the new index), I'm wondering if there's a better workaround that can get us closer to the original performance. It also seems like kind of a strange case so I'm wondering if there's something weird going on in the optimizer. The # of rows estimates are pretty accurate so it's guessing that about right, but the planner seems to be putting way too much weight on using a sorted index vs. looking up. This is all after an analyze.
Near as I can guess the planner seems to be weighting scanning what should be an expected 100k rows (though in practice it will have to do about 35 million, because the assumption of independence between columns is incorrect) given an expected selectivity of 48K rows out of 45 million over scanning ~48k rows (using the index) and doing a top-n 100 sort on them (actual row count is 43k so pretty close on that). Even giving the optimizer the benefit of column independence I don't see how that first plan could possibly come out ahead. It would really help if explain would print out the number of rows it expects to scan and analyze would print out the number of rows it actually scanned (instead of just the number that matched the filter/limit), see the expensive query explain analyze output below.
At the bottom I have some info on the contents and probability.
## The original Query:
explain analyze
SELECT customs_records.* FROM "customs_records" WHERE (((buyer_id IS NOT NULL AND buyer_id IN (1172672,1570888,1336461,1336464,1336471,1189285,1336463,1336460,1654709,1000155646,1191114,1336480,1336479,1000384928,1161787,1811495,1507188,1159339,1980765,1200258,1980770,1980768,1980767,1980769,1980766,1980772,1000350850,1000265917,1980764,1980761,1170019,1980762,1184356,1985585)))) ORDER BY month DESC LIMIT 100 OFFSET 0;
-----------------------------------
Limit (cost=184626.64..184626.89 rows=100 width=908) (actual time=102.630..102.777 rows=100 loops=1)
-> Sort (cost=184626.64..184748.19 rows=48623 width=908) (actual time=102.628..102.683 rows=100 loops=1)
Sort Key: month
Sort Method: top-N heapsort Memory: 132kB
-> Bitmap Heap Scan on customs_records (cost=1054.22..182768.30 rows=48623 width=908) (actual time=5.809..44.832 rows=43352 loops=1)
Recheck Cond: ((buyer_id IS NOT NULL) AND (buyer_id = ANY ('{1172672,1570888,1336461,1336464,1336471,1189285,1336463,1336460,1654709,1000155646,1191114,1336480,1336479,1000384928,1161787,1811495,1507188,1159339,1980765,1200258,1980770,1980768,1980767,1980769,1980766,1980772,1000350850,1000265917,1980764,1980761,1170019,1980762,1184356,1985585}'::integer[])))
-> Bitmap Index Scan on index_customs_records_on_buyer_id (cost=0.00..1042.07 rows=48623 width=0) (actual time=4.588..4.588 rows=43352 loops=1)
Index Cond: ((buyer_id IS NOT NULL) AND (buyer_id = ANY ('{1172672,1570888,1336461,1336464,1336471,1189285,1336463,1336460,1654709,1000155646,1191114,1336480,1336479,1000384928,1161787,1811495,1507188,1159339,1980765,1200258,1980770,1980768,1980767,1980769,1980766,1980772,1000350850,1000265917,1980764,1980761,1170019,1980762,1184356,1985585}'::integer[])))
Total runtime: 102.919 ms
## Same query after adding the new index
### NOTE - it would be very useful here if explain would print out the number of rows it expects to scan in the index and analyze dumped out the number of rows actually scanned. Instead analyze is printing the rows actually outputed and explain appears to be outputting the number of rows expected to match the filter ignoring the limit... (it exactly matches the row count in the query above)
##
explain analyze
SELECT customs_records.* FROM "customs_records" WHERE (((buyer_id IS NOT NULL AND buyer_id IN (1172672,1570888,1336461,1336464,1336471,1189285,1336463,1336460,1654709,1000155646,1191114,1336480,1336479,1000384928,1161787,1811495,1507188,1159339,1980765,1200258,1980770,1980768,1980767,1980769,1980766,1980772,1000350850,1000265917,1980764,1980761,1170019,1980762,1184356,1985585)))) ORDER BY month DESC LIMIT 100 OFFSET 0;
--------------------------------------------
Limit (cost=0.00..161295.58 rows=100 width=908) (actual time=171344.185..3858893.743 rows=100 loops=1)
-> Index Scan Backward using index_customs_records_on_month_and_bl_number on customs_records (cost=0.00..78426750.74 rows=48623 width=908) (actual time=171344.182..3858893.588 rows=100 loops=1)
Filter: ((buyer_id IS NOT NULL) AND (buyer_id = ANY ('{1172672,1570888,1336461,1336464,1336471,1189285,1336463,1336460,1654709,1000155646,1191114,1336480,1336479,1000384928,1161787,1811495,1507188,1159339,1980765,1200258,1980770,1980768,1980767,1980769,1980766,1980772,1000350850,1000265917,1980764,1980761,1170019,1980762,1184356,1985585}'::integer[])))
Total runtime: 3858893.908 ms
############################################################
My workaround is to use a CTE query to force the planner to not use the month index for sorting (using a subselect is not enough since the planner is too smart for that). However this is still twice as slow as the original query...
############################################################
explain analyze
with foo as (select customs_records.* FROM "customs_records" WHERE (((buyer_id IS NOT NULL AND buyer_id IN (1172672,1570888,1336461,1336464,1336471,1189285,1336463,1336460,1654709,1000155646,1191114,1336480,1336479,1000384928,1161787,1811495,1507188,1159339,1980765,1200258,1980770,1980768,1980767,1980769,1980766,1980772,1000350850,1000265917,1980764,1980761,1170019,1980762,1184356,1985585))))) select * from foo order by month desc limit 100 ;
-----------------------------------------------------------
Limit (cost=185599.10..185599.35 rows=100 width=5325) (actual time=196.968..197.105 rows=100 loops=1)
CTE foo
-> Bitmap Heap Scan on customs_records (cost=1054.22..182768.30 rows=48623 width=908) (actual time=5.765..43.489 rows=43352 loops=1)
Recheck Cond: ((buyer_id IS NOT NULL) AND (buyer_id = ANY ('{1172672,1570888,1336461,1336464,1336471,1189285,1336463,1336460,1654709,1000155646,1191114,1336480,1336479,1000384928,1161787,1811495,1507188,1159339,1980765,1200258,1980770,1980768,1980767,1980769,1980766,1980772,1000350850,1000265917,1980764,1980761,1170019,1980762,1184356,1985585}'::integer[])))
-> Bitmap Index Scan on index_customs_records_on_buyer_id (cost=0.00..1042.07 rows=48623 width=0) (actual time=4.544..4.544 rows=43352 loops=1)
Index Cond: ((buyer_id IS NOT NULL) AND (buyer_id = ANY ('{1172672,1570888,1336461,1336464,1336471,1189285,1336463,1336460,1654709,1000155646,1191114,1336480,1336479,1000384928,1161787,1811495,1507188,1159339,1980765,1200258,1980770,1980768,1980767,1980769,1980766,1980772,1000350850,1000265917,1980764,1980761,1170019,1980762,1184356,1985585}'::integer[])))
-> Sort (cost=2830.80..2952.35 rows=48623 width=5325) (actual time=196.966..197.029 rows=100 loops=1)
Sort Key: foo.month
Sort Method: top-N heapsort Memory: 132kB
-> CTE Scan on foo (cost=0.00..972.46 rows=48623 width=5325) (actual time=5.770..153.322 rows=43352 loops=1)
Total runtime: 207.282 ms
#### Table information
Table information - the schema is the table below (with some columns removed for succinctness). There are ~45 million rows, the rows are also fairly wide about 80 columns total. buyer_id is null ~30% of the time (as is supplier_id). A given buyer id maps to between 1 and ~100,000 records (in a decreasing distribution, about 1 million unique buyer id values). Supplier_id is similar. Note buyer_id and month columns are not always independent (for some buyer_ids there is a strong correlation as in this case where the buyer_ids are associated with only older months, though for others there isn't), though even so I'm still not clear on why it would pick the plan that it does. We can consider these table never updated or inserted into (updates are done in a new db offline that is periodically swapped in).
Table "public.customs_records"
Column | Type | Modifiers
--------------------------+------------------------+--------------------------------------------------------------
id | integer | not null default nextval('customs_records_id_seq'::regclass)
....
bl_number | character varying(16) |
....
month | date |
....
buyer_id | integer |
...
supplier_id | integer |
...
Indexes:
"customs_records_pkey" PRIMARY KEY, btree (id) WITH (fillfactor=100)
"index_customs_records_on_month_and_bl_number" UNIQUE, btree (month, bl_number) WITH (fillfactor=100)
"index_customs_records_on_buyer_id" btree (buyer_id) WITH (fillfactor=100) WHERE buyer_id IS NOT NULL
"index_customs_records_on_supplier_id_and_buyer_id" btree (supplier_id, buyer_id) WITH (fillfactor=100) CLUSTER
db version =>
PostgreSQL 9.0.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit
(enterprise db build)
ubuntu 8.04 LTS is the host