Re: Bitmap scan is undercosted? - overestimated correlation and cost_index




On Wed, Dec 6, 2017 at 1:46 PM, Justin Pryzby <pryzby@xxxxxxxxxxxxx> wrote:
On Tue, Dec 05, 2017 at 01:50:11PM -0500, Tom Lane wrote:
> Jeff Janes <jeff.janes@xxxxxxxxx> writes:
> > On Dec 3, 2017 15:31, "Tom Lane" <tgl@xxxxxxxxxxxxx> wrote:
> >> Jeff Janes <jeff.janes@xxxxxxxxx> writes:
> >>> But I do see that ties within the logical order of the column values are
> >>> broken to agree with the physical order.  That is wrong, right?  Is there
> >>> any argument that this is desirable?
>
> >> Uh ... what do you propose doing instead?  We'd have to do something with
> >> ties, and it's not so obvious this way is wrong.
>
> > Let them be tied.
...
> I thought some more about this.  What we really want the correlation stat
> to do is help us estimate how randomly an index-ordered scan will access
> the heap.  If the values we've sampled are all unequal then there's no
> particular issue.  However, if we have some group of equal values, we
> do not really know what order an indexscan will visit them in.  The
> existing correlation calculation is making the *most optimistic possible*
> assumption, that such a group will be visited exactly in heap order ---
> and that assumption isn't too defensible.

I'm interested in the discussion regarding bitmap cost, since it would have helped
our case discussed here ~18 months ago:
https://www.postgresql.org/message-id/flat/20160524173914.GA11880%40telsasoft.com#20160524173914.GA11880@telsasoft.com

...but remember: in Vitaliy's case (as opposed to mine), the index scan is
*faster* yet is estimated at a higher cost than the bitmap scan (I have to keep
reminding myself).  So the rest of this discussion is about the overly
optimistic cost estimate of index scans, which moves in the opposite direction
from this reported problem.  For the test cases I looked at, index scans were
used when random_page_cost=1 and redundant conditions were avoided, so I'm not
sure whether there's any remaining issue (but I haven't looked at the latest
cases Vitaliy sent).

> In any case, given that we do this calculation without regard
> to any specific index,

One solution is to compute stats (at least correlation) for all indexes, not
just expression indexes.  I did that earlier this year while throwing ideas around.
https://www.postgresql.org/message-id/20170707234119.GN17566%40telsasoft.com

When is the correlation of a column ever used, other than when it is the leading column of a btree index or a column in a brin index?  If we did compute index-specific correlations, we could maybe just drop pure-column correlations.



> We do have an idea, from the data we have, whether the duplicates are close
> together in the heap or spread all over.

I think you just mean pg_stats.correlation for all values, not just duplicates
(with the understanding that duplicates might be a large fraction of the
tuples, and thus carry a lot of weight in the correlation).

Another issue I noted in an earlier thread is that as table sizes increase, the
existing correlation computation approaches 1 for correlated insertions (like
"append-only" timestamps clustered around now()), due to ANALYZE sampling a
fraction of the table and thereby representing only large-scale correlation.
That isn't due to sampling.  That is due to the definition of linear correlation.  Large scale is what it is about.


Generated data demonstrating this (I reused this query so it's more complicated
than it needs to be):

[pryzbyj@database ~]$ time for sz in 9999{,9{,9{,9{,9}}}} ; do psql postgres -tc "DROP TABLE IF EXISTS t; CREATE TABLE t(i float, j int); CREATE INDEX ON t(i);INSERT INTO t SELECT i/99999.0+pow(2,(-random())) FROM generate_series(1,$sz) i ORDER BY i; ANALYZE t; SELECT $sz, correlation, most_common_freqs[1] FROM pg_stats WHERE attname='i' AND tablename='t'"; done

     9999 |    0.187146 |
    99999 |    0.900629 |
   999999 |    0.998772 |
  9999999 |    0.999987 |

Because the amount of jitter introduced is constant WRT $sz, but the range of i/99999.0 increases with $sz, the correlation actually does increase; it is not a sampling effect.
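
As a sanity check (just a sketch; it regenerates the jitter instead of reading it back from t, so the exact value will wobble), the same rank correlation computed over every generated row, with no sampling anywhere, should come out in the same ~0.9999 neighborhood as the ANALYZE estimate for the largest size:

    -- full-population correlation between insertion order and value rank,
    -- mirroring the ORDER BY i insertion in the loop above (no ANALYZE sample)
    SELECT corr(phys_pos, val_rank) AS full_table_correlation
    FROM (
        SELECT phys_pos,
               rank() OVER (ORDER BY val) AS val_rank
        FROM (
            SELECT i AS phys_pos,
                   i/99999.0 + pow(2, -random()) AS val
            FROM generate_series(1, 9999999) i
        ) raw
    ) ranked;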

Trying to keep it all in my own head: for a sufficiently large number of pages,
a bitmap scan should be preferred to an index scan, because the reduced random
page cost outweighs the bitmap scan's extra CPU cost.

But CPU cost is probably not why it is losing anyway.

Index scans get a double bonus from high correlation.  It assumes that only a small fraction of the table will be visited.  And then it assumes that the visits it does make will be largely sequential.  I think that you are saying that for a large enough table, that last assumption is wrong, that the residual amount of non-correlation is enough to make the table reads more random than sequential.  Maybe.  Do you have a test case that demonstrates this?  If so, how big do we need to go, and can you see the problem on SSD as well as HDD?

The thing is, the bitmap scan gets cheated out of one of these bonuses.  It gets no credit for visiting only a small part of the table when the correlation is high, but does get credit for being mostly sequential in whatever visits it does make (even if the correlation is low, because of course the bitmap perfects the correlation).
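
Paraphrasing cost_index() in costsize.c from memory (so the exact form may differ slightly), the IO part of the index scan estimate is an interpolation between a worst case and a best case, weighted by the square of the correlation:

    max_IO_cost = pages_fetched * random_page_cost
                  (pages_fetched estimated by the Mackert-Lohman formula)
    min_IO_cost = random_page_cost + (ceil(selectivity * pages) - 1) * seq_page_cost
    IO_cost     = max_IO_cost + correlation^2 * (min_IO_cost - max_IO_cost)

With correlation near 1 the estimate collapses toward min_IO_cost, which is both small (only ceil(selectivity * pages) pages) and nearly all sequential; that is the double bonus.  The bitmap heap scan's cost estimate does not look at correlation at all.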

I think it would be easier to give bitmap scans their due rather than try to knock down index scans.  But of course a synthetic test case would go a long way toward either one.
 

Probably by penalizing index scans rather than discounting bitmap scans.
Conceivably a correlation adjustment could be conditionalized or weighted based
on index_pages_fetched() ...
        x = log(x / 999999);
        if (x > 1)
            correlation /= x;

If we want to do that, I think we should go with something that has a statistical principle behind it.  There is the notion of "restricted range" in correlations: if there is a high correlation over the full range of the data, but you zoom in on only a small fraction of that range, the correlation you see over that restricted range will be much less than the full correlation.  I don't know that field well enough to give a formula off the top of my head, but I think it will be based on the fraction of the key space which is being scanned, and (1-RSQ), rather than an arbitrary number like 999999.
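
For whatever it's worth, one textbook form of the direct range-restriction relation (Thorndike's "case 2" run in reverse; I am not certain it is the right variant for the planner's situation) is, writing R for the full-range correlation and u for the ratio of the restricted key range's standard deviation to the full range's:

    r_restricted = R * u / sqrt(1 - R^2 + R^2 * u^2)

For a scan covering a fraction f of a roughly uniform key space, u is about f, so this would indeed be driven by the scanned fraction and (1-RSQ) rather than by any magic constant.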
 
Cheers,

Jeff
