On Tue, Feb 19, 2019 at 09:29:46PM -0700, Michael Lewis wrote:
> On Tue, Feb 19, 2019, 8:00 PM Andrew Gierth <andrew@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> > >>>>> "Abi" == Abi Noda <a@xxxxxxxxxxx> writes:
> >
> >  Abi> However, when I index the closed column, a bitmap scan is used
> >  Abi> instead of an index scan, with slightly slower performance. Why
> >  Abi> isn't an index scan being used, given that the exact same number
> >  Abi> of rows are at play as in my query on the state column?
> >
> > Most likely difference is the correlation estimate for the conditions.
> > The cost of an index scan includes a factor based on how well correlated
> > the physical position of rows is with the index order, because this
> > affects the number of random seeks in the scan. But for nulls this
> > estimate cannot be performed, and bitmapscan is cheaper than plain
> > indexscan on poorly correlated data.
>
> Does this imply that the optimizer would always prefer the bitmapscan
> rather than index scan even if random page cost = 1, aka sequential cost,
> when the correlation is unknown like a null? Or only when it thinks random
> access is more expensive by some significant factor?

No; for one, with a bitmap scan the heap scan can't begin until the index
scan is done, so there's a high(er) initial cost. Otherwise a bitmap scan
could always be used and all access could be ordered (even if not
sequential).

Justin
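
One way to check the points above on your own data is sketched below. This is only a rough sketch: the table name "issues" and the predicate are hypothetical stand-ins (the thread only names the columns "state" and "closed"), so substitute your own. It looks at the planner's correlation statistic for both columns, then compares the plans with bitmap scans disabled and with random_page_cost = 1. In the EXPLAIN output each node's cost is printed as startup..total, and the startup figure on the Bitmap Heap Scan node should reflect the cost of finishing the index scan first.

```sql
-- Hypothetical names: table "issues" and the predicate "closed = false".
-- The columns "state" and "closed" are the ones mentioned in the thread.

-- 1. Planner statistics for both columns (run ANALYZE on the table first).
--    correlation near +1 or -1 means heap order follows index order;
--    near 0 means poorly correlated.
SELECT attname, null_frac, n_distinct, correlation
FROM pg_stats
WHERE tablename = 'issues'
  AND attname IN ('state', 'closed');

-- 2. The plan the optimizer picks by default.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM issues WHERE closed = false;

-- 3. The same query with bitmap scans disabled, to see the estimated
--    cost of the alternative plan. SET LOCAL only lasts until the
--    transaction ends.
BEGIN;
SET LOCAL enable_bitmapscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM issues WHERE closed = false;
ROLLBACK;

-- 4. The same query with random access costed like sequential access.
BEGIN;
SET LOCAL random_page_cost = 1;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM issues WHERE closed = false;
ROLLBACK;
```

Comparing the estimated costs from steps 2 and 3 shows how close the two plans are in the planner's view; a small gap would be consistent with the "slightly slower" timing reported in the original question.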