Re: Setting Statistics on Functional Indexes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/24/2012 02:11 PM, Tom Lane wrote:

It's not particularly (not that you've even defined what you think
"optimistic" is, much less mentioned what baseline you're comparing
to).

The main flaw with my example is that it's random. But I swear I'm not making it up! :)

There seems to be a particularly nasty edge case we're triggering, then. Like I said, it's worse when col1+col2 don't match anything. In that case, it's using the trunc index on the date column, which has demonstrably worse performance. Here are the two analyzes I got before/after front-loading statistics.

Before stats increase:

Sort (cost=9.38..9.39 rows=1 width=23) (actual time=78.282..78.282 rows=0 loops=1)
   Sort Key: id, action_date
   Sort Method: quicksort  Memory: 25kB
-> Index Scan using idx_date_test_action_date_trunc on date_test (cost=0.00..9.37 rows=1 width=23) (actual time=78.274..78.274 rows=0 loops=1) Index Cond: (date_trunc('day'::text, action_date) >= '2012-10-24 00:00:00'::timestamp without time zone)
         Filter: (((col1)::text = 'S:96'::text) AND (col2 = 657::numeric))
 Total runtime: 78.317 ms


And then after. I used your unofficial trick to set it to 1000:

alter index idx_date_test_action_date_trunc
      alter column date_trunc set statistics 1000;
analyze date_test;


Sort (cost=9.83..9.83 rows=1 width=23) (actual time=0.038..0.038 rows=0 loops=1)
   Sort Key: id, action_date
   Sort Method: quicksort  Memory: 25kB
-> Index Scan using idx_date_test_col1_col2 on date_test (cost=0.00..9.82 rows=1 width=23) (actual time=0.030..0.030 rows=0 loops=1) Index Cond: (((col1)::text = 'S:96'::text) AND (col2 = 657::numeric)) Filter: (date_trunc('day'::text, action_date) >= '2012-10-24 00:00:00'::timestamp without time zone)
 Total runtime: 0.066 ms


This is on a bone-stock PG 9.1.6 from Ubuntu 12.04 LTS, with default_statistics increased to 500. The only thing I bumped up was the functional index between those two query plans.

But then I noticed something else. I reverted back to the old 500 default for everything, and added an index:

create index idx_date_test_action_date_trunc_col1
    on date_test (date_trunc('day', action_date), col1);

I think we can agree that this index would be more selective than the one on date_trunc by itself. Yet:

Sort (cost=9.38..9.39 rows=1 width=23) (actual time=77.055..77.055 rows=0 loops=1)
   Sort Key: id, action_date
   Sort Method: quicksort  Memory: 25kB
-> Index Scan using idx_date_test_action_date_trunc on date_test (cost=0.00..9.37 rows=1 width=23) (actual time=77.046..77.046 rows=0 loops=1) Index Cond: (date_trunc('day'::text, action_date) >= '2012-10-24 00:00:00'::timestamp without time zone)
         Filter: (((col1)::text = 'S:96'::text) AND (col2 = 657::numeric))
 Total runtime: 77.091 ms


All I have to say about that is: wat.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@xxxxxxxxxxxxxxxx

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux