Regarding this earlier thread: https://www.postgresql.org/message-id/flat/20160524173914.GA11880%40telsasoft.com#20160524173914.GA11880@xxxxxxxxxxxxx On Tue, May 24, 2016 at 10:39 AM, Justin Pryzby <pryzby@xxxxxxxxxxxxx> wrote: > Summary: Non-unique btree indices are returning CTIDs for rows with same > value of indexed column not in logical order, imposing a high performance > penalty. I have to point out that by "logical" I clearly meant "physical", hopefully nobody was too misled.. On Sun, Jun 05, 2016 at 12:28:47PM -0700, Jeff Janes wrote: > On Sun, Jun 5, 2016 at 9:03 AM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > > Claudio Freire <klaussfreire@xxxxxxxxx> writes: > >> So correlated index scans look extra favourable vs bitmap index scans > >> because bitmap heap scans consider random page costs sans correlation > >> effects (even though correlation applies to bitmap heap scans as > >> well). > > > > Really? How? The index ordering has nothing to do with the order in > > which heap tuples will be visited. > > It is not the order itself, but the density. > > If the index is read in a range scan (as opposed to =ANY scan), and > the index lead column is correlated with the table ordering, then the > parts of the table that need to be visited will be much denser than if > there were no correlation. But Claudio is saying that this is not > being accounted for. I didn't completely understand Claudio/Jeff here, and not sure if we're on the same page. For queries on these tables, the index scan was very slow, due to fragmented index on non-unique column, and seq scan would have been (was) faster (even if it means reading 70GB and filtering out 6 of 7 days' data). That was resolved by added a nightly reindex job (.. which sometimes competes with other maintenance and has trouble running every table every night). But I did find that someone else had previously reported this problem (in a strikingly similar context and message, perhaps clearer than mine): https://www.postgresql.org/message-id/flat/520D6610.8040907%40emulex.com#520D6610.8040907@xxxxxxxxxx I also found this older thread: https://www.postgresql.org/message-id/flat/n6cmpug13b9rk1srebjvhphg0lm8dou1kn%404ax.com#n6cmpug13b9rk1srebjvhphg0lm8dou1kn@xxxxxxx There was mention of a TODO item: * Compute index correlation on CREATE INDEX and ANALYZE, use it for index * scan cost estimation .. but perhaps I misunderstand and that's long since resolved ? Justin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance