Re: Setting Statistics on Functional Indexes

Claudio Freire <klaussfreire@xxxxxxxxx> · Fri, 26 Oct 2012 19:05:27 -0300

On Fri, Oct 26, 2012 at 7:04 PM, Claudio Freire <klaussfreire@xxxxxxxxx> wrote:
> On Fri, Oct 26, 2012 at 7:01 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>> Claudio Freire <klaussfreire@xxxxxxxxx> writes:
>>> Because once you've accessed that last index page, it would be rather
>>> trivial finding out how many duplicate tids are in that page and, with
>>> a small CPU cost (no disk access if you don't query other index pages)
>>> you could verify the assumption of near-uniqueness.
>>
>> I thought about that too, but I'm not sure how promising the idea is.
>> In the first place, it's not clear when to stop counting duplicates, and
>> in the second, I'm not sure we could get away with not visiting the heap
>> to check for tuple liveness.  There might be a lot of apparent
>> duplicates in the index that just represent unreaped old versions of a
>> frequently-updated endpoint tuple.  (The existing code is capable of
>> returning a "wrong" answer if the endpoint tuple is dead, but I don't
>> think it matters much in most cases.  I'm less sure such an argument
>> could be made for dup-counting.)
>
> Would checking the visibility map be too bad? An index page's worth of
> tuples should also fit within a page in the visibility map.

Scratch that, they're sorted by tid. So it could be lots of pages in
random order.
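
To put rough numbers on that worry, here is a minimal sketch (not PostgreSQL code) that maps the heap blocks referenced by one index page onto visibility map pages. The constants are assumptions modeled on 9.2 defaults: 8 kB blocks, a 24-byte page header, and one visibility bit per heap page. The function names and the sample tids are hypothetical.

```python
# Assumed constants, modeled on PostgreSQL 9.2 defaults (one VM bit per heap page).
BLCKSZ = 8192
PAGE_HEADER_BYTES = 24
HEAPBLOCKS_PER_VM_PAGE = (BLCKSZ - PAGE_HEADER_BYTES) * 8


def vm_page_for(heap_block):
    """Return the visibility map page number that covers a heap block."""
    return heap_block // HEAPBLOCKS_PER_VM_PAGE


def vm_access_pattern(tids):
    """Summarize VM accesses for tids visited in the given order.

    tids is a list of (heap_block, offset) pairs, as they would be read
    off a single index page. Returns a pair:
    (number of distinct VM pages, number of switches between VM pages).
    """
    pages = [vm_page_for(block) for block, _offset in tids]
    distinct = len(set(pages))
    switches = sum(1 for a, b in zip(pages, pages[1:]) if a != b)
    return distinct, switches


# Hypothetical inputs: tids clustered in one heap region vs. scattered ones.
clustered = [(10, 1), (11, 2), (12, 1)]
scattered = [(0, 1), (200000, 1), (5, 1), (70000, 1)]
```

Under these assumptions a single VM page covers about 65,000 heap blocks (roughly 510 MB of heap), so clustered tids stay within one VM page, while tids spread over a large table can bounce between several VM pages in no useful order.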

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)