On Fri, Nov 18, 2022 at 12:46 PM Peter J. Holzer <hjp-pgsql@xxxxxx> wrote: > Both do a parallel index only scan. Both perform 0 heap fetches. > But one reads 27336 buffers (or about 22 bytes per index entry, which > sounds reasonable) while the other reads 9995216 buffers (or almost one > full buffer per row). Why? The entries should be dense in the index in > both cases and since it's an index only scan (and explain says there > were 0 heap fetches) I would not expect extra accesses. Where do these > buffer reads come from? The index-only scan processes an index leaf page at a time. When there is naturally a high correlation (or some kind of clustering) in how we access VM pages, we'll naturally be able to do more visibility checks covering more index tuples per VM page accessed. This is a less severe problem here than it would be with an equivalent pair of plain index scans, just because there are so few VM pages relative to heap pages. But it's more or less an analogous problem. You're really noticing it here because these index scans have very low selectivity -- which is kinda unusual in most environments. Also worth bearing in mind that it's unusual to have a perfectly random and uniformly distributed clustering of index tuples, which is what the index built via hashing exhibits. Even a v4 UUID index could easily have plenty of duplicates, which would probably do significantly better on the metric you've focussed on. -- Peter Geoghegan