[ Please don't re-quote the entire damn thread in each followup. Have some respect for your readers' time, and assume that they have already seen the previous traffic, or could go look it up if they haven't. The point of quoting at all is just to quickly remind people where we are in the discussion. ] Jake Nielsen <jake.k.nielsen@xxxxxxxxx> writes: > So... it seems that setting the userId to one that has less rows in the > table results in the index actually being used... > EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM SyncerEvent WHERe userId = > '57d35db7353b0d627c0e592f' AND ID > 12468 ORDER BY ID LIMIT 4000; It looks from the numbers floating around in this thread that the userId used in your original query actually matches about 50% of the table. That would make it unsurprising that the planner doesn't want to use an index. A rule of thumb is that a seqscan is going to be cheaper than an indexscan if your query retrieves, or even just has to fetch, more than a few percent of the table. Now, given the existence of an index on (userID, ID) --- in that order --- I would expect the planner to want to use that index for a query shaped exactly as you show above. Basically, it knows that that just requires starting at the ('57d35db7353b0d627c0e592f', 12468) position in the index and scanning forward for 4000 index entries; no extraneous table rows will be fetched at all. If you increased the LIMIT enough, it'd go over to a seqscan-and-sort to avoid doing so much random access to the table, but I'd think the crossover point for that is well above 4000 out of 3.3M rows. However, as soon as you add any other unindexable conditions, the situation changes because rows that fail the additional conditions represent useless fetches. Now, instead of fetching 4000 rows using the index, it's fetching 4000 times some multiplier. It's hard to tell for sure given the available info, but I think that the extra inequalities in your original query reject a pretty sizable proportion of rows, resulting in the indexscan approach needing to fetch a great deal more than 4000 rows, making it look to be more expensive than a seqscan. I'm not sure why it's preferring the pkey index to the one on (userID, ID), but possibly that has something to do with that index being better correlated to the physical table order, resulting in a prediction of less random I/O when using that index. So the bottom line is that given your data statistics, there may well be no really good plan for your original query. It just requires fetching a lot of rows, and indexes can't help very much. If you say "well yeah, but it seems to perform fine when I force it to use that index anyway", the answer may be that you need to adjust random_page_cost. The default value is OK for tables that are mostly sitting on spinning rust, but if your database is RAM-resident or SSD-resident you probably want a value closer to 1. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance