On 29 September 2010 10:03, Mark Kirkwood <mark.kirkwood@xxxxxxxxxxxxxxx> wrote:
> Yeah, I think the idea of trying to have a few smaller indexes for the 'hot'
> customers is a good idea. However I am wondering if just using single column
> indexes and seeing if the bitmap scan/merge of smaller indexes is actually
> more efficient is worth testing - i.e:
>
> acc_trans(trans_type);
> acc_trans(created);
> acc_trans(customer_id);

My gut feeling tells me that it's not a good idea - consider that we want to
pull out 20 rows from a 60M-row table. If I'm not mistaken, bitmapping needs
to operate on the whole indexes - and 60M bits is still 7.5 megabytes. Well,
I suppose that nowadays it's relatively fast to bitmap 7.5 MB of memory, but
probably still some orders of magnitude slower than the few milliseconds it
takes to pick the 20 rows directly out of the specialized index.

Well, why rely on gut feelings when things can be measured? I didn't take the
figures below from the production database server, but they at least give a
hint of what to expect.

First, using the three-key index for "select * from acc_trans where
customer_id=? and trans_type=? order by created desc limit 20". I chose one
of the users with the most transactions, and I tested with the most popular
transaction type as well as one transaction type where he has just a handful
of transactions. Both queries took significantly less than 1 ms to run.

Then I deleted all indexes and created the three suggested single-column
indexes. Using the popular transaction type, the query took 123 ms. Well,
that's 500 times as long, but still acceptable.
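For concreteness, the two strategies being compared could be set up like this
(the index names are mine, not from the actual schema; the column lists follow
the quoted suggestion and the query above):

```sql
-- Option 1: three single-column indexes (the quoted suggestion); the
-- planner combines them at query time with a BitmapAnd of their bitmaps.
CREATE INDEX acc_trans_on_type     ON acc_trans (trans_type);
CREATE INDEX acc_trans_on_created  ON acc_trans (created);
CREATE INDEX acc_trans_on_customer ON acc_trans (customer_id);

-- Option 2: one composite ("three-key") index matching both the WHERE
-- clause and the ORDER BY, so the newest 20 matching rows can be read
-- directly in sorted order, without building any bitmaps.
CREATE INDEX acc_trans_by_cust_type_ts
    ON acc_trans (customer_id, trans_type, created);
```

With option 2 the planner can walk the index backward and stop after 20 rows,
which is why that variant finishes in well under a millisecond.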
Here is the query plan:

=> explain analyze select * from acc_trans where customer_id=67368 and trans_type=8 order by created desc limit 20;

                                                                      QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1486.23..1486.28 rows=20 width=200) (actual time=123.685..123.687 rows=3 loops=1)
   ->  Sort  (cost=1486.23..1486.34 rows=43 width=200) (actual time=123.684..123.685 rows=3 loops=1)
         Sort Key: created
         Sort Method:  quicksort  Memory: 25kB
         ->  Bitmap Heap Scan on acc_trans  (cost=1313.90..1485.08 rows=43 width=200) (actual time=121.350..123.669 rows=3 loops=1)
               Recheck Cond: ((trans_type = 8) AND (customer_id = 67368))
               ->  BitmapAnd  (cost=1313.90..1313.90 rows=43 width=0) (actual time=120.342..120.342 rows=0 loops=1)
                     ->  Bitmap Index Scan on account_transaction_on_type  (cost=0.00..256.31 rows=13614 width=0) (actual time=12.200..12.200 rows=43209 loops=1)
                           Index Cond: (trans_type = 8)
                     ->  Bitmap Index Scan on account_transaction_on_user  (cost=0.00..1057.31 rows=56947 width=0) (actual time=104.578..104.578 rows=59133 loops=1)
                           Index Cond: (users_id = 67368)
 Total runtime: 123.752 ms
(12 rows)

With the most popular trans type it chose another plan, and it took more than
3 s (totally unacceptable):

=> explain analyze select * from acc_trans where customer_id=67368 and trans_type=6 order by created desc limit 20;

                                                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..44537.82 rows=20 width=200) (actual time=1746.288..3204.029 rows=20 loops=1)
   ->  Index Scan Backward using account_transaction_on_created on acc_trans  (cost=0.00..55402817.90 rows=24879 width=200) (actual time=1746.285..3204.021 rows=20 loops=1)
         Filter: ((customer_id = 67368) AND (trans_type = 6))
 Total runtime: 3204.079 ms
(4 rows)

Although this
customer has several tens of thousands of transactions, dropping the
three-key index and using an index on (users_id, created) is clearly a better
option than running out of memory:

=> explain analyze select * from acc_trans where customer_id=67368 and trans_type=8 order by created desc limit 20;

                                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..98524.88 rows=20 width=200) (actual time=0.669..197.012 rows=3 loops=1)
   ->  Index Scan Backward using account_transaction_by_user_ts on acc_trans  (cost=0.00..211828.49 rows=43 width=200) (actual time=0.668..197.006 rows=3 loops=1)
         Index Cond: (customer_id = 67368)
         Filter: (trans_type = 8)
 Total runtime: 197.066 ms
(5 rows)

0.2 s sounds acceptable; it's just that this may be only a small part of
building the web page, so it adds up ... and (I didn't check how profitable
this customer is) this is probably exactly the kind of customer we wouldn't
want to annoy with several seconds of page load time.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
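P.S. For reference, the fallback index from the last plan above, plus a
sketch of the "smaller indexes for the hot customers" idea from the quoted
mail, expressed as a partial index (hypothetical - the customer id 67368 is
just the example from the plans, and one such index would be needed per hot
customer):

```sql
-- The two-column index used in the last plan: scan backward by created
-- for one customer and filter on trans_type on the fly.
CREATE INDEX account_transaction_by_user_ts
    ON acc_trans (customer_id, created);

-- Hypothetical per-customer partial index: tiny compared to the full
-- table, and it returns that customer's rows already sorted by created.
CREATE INDEX acc_trans_hot_67368
    ON acc_trans (trans_type, created)
    WHERE customer_id = 67368;
```

The partial index only helps queries whose WHERE clause implies the index
predicate, so it trades planner generality for a much smaller structure.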