On Fri, Oct 26, 2012 at 7:04 PM, Claudio Freire <klaussfreire@xxxxxxxxx> wrote:
> On Fri, Oct 26, 2012 at 7:01 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>> Claudio Freire <klaussfreire@xxxxxxxxx> writes:
>>> Because once you've accessed that last index page, it would be rather
>>> trivial finding out how many duplicate tids are in that page and, with
>>> a small CPU cost (no disk access if you don't query other index pages)
>>> you could verify the assumption of near-uniqueness.
>>
>> I thought about that too, but I'm not sure how promising the idea is.
>> In the first place, it's not clear when to stop counting duplicates, and
>> in the second, I'm not sure we could get away with not visiting the heap
>> to check for tuple liveness. There might be a lot of apparent
>> duplicates in the index that just represent unreaped old versions of a
>> frequently-updated endpoint tuple. (The existing code is capable of
>> returning a "wrong" answer if the endpoint tuple is dead, but I don't
>> think it matters much in most cases. I'm less sure such an argument
>> could be made for dup-counting.)
>
> Would checking the visibility map be too bad? An index page worth of
> tuples should also fit within a page in the visibility map.

Scratch that, they're sorted by tid. So it could be lots of pages in
random order.