Re: index problems (again)

Jeff Janes <jeff.janes@xxxxxxxxx> · Sat, 12 Mar 2016 22:40:10 -0800

On Tue, Mar 8, 2016 at 2:16 AM, Geoff Winkless <pgsqladmin@xxxxxxxx> wrote:
> On 7 March 2016 at 20:40, Peter J. Holzer <hjp-pgsql@xxxxxx> wrote:
>> As Tom wrote, the estimate of having to read only about 140 rows is only
>> valid if sc_id and sc_date are uncorrelated. In reality your query has
>> to read a lot more than 140 rows, so it is much slower.
>
> But as I've said previously, even if I do select from scdate values
> that I know to be in the first 1% of the data (supposedly the perfect
> condition) the scan method is insignificantly quicker than the index
> (scdate,scid) method.

That is sure not the case in my hands.  If I select from the first 1%,
I get the (scid) index being 10 times faster than (scdate,scid), and
(scid,scdate) being 50 times faster.

> Even with the absolute perfect storm (loading in the entire index for
> the full range) it's still not too bad (1.3 seconds or so).
>
> The point is that to assume, knowing nothing about the data, that the
> data is in an even distribution is only a valid strategy if the worst
> case (when that assumption turns out to be wildly incorrect) is not
> catastrophic. That's not the case here.

That just makes someone else's catastrophic worst case come to the fore.

Cheers,

Jeff

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general