Re: Query performance with disabled hashjoin and mergejoin

Merlin Moncure <mmoncure@xxxxxxxxx> · Mon, 7 Mar 2011 14:40:48 -0600

On Tue, Feb 22, 2011 at 9:07 PM, Robert Haas <robertmhaas@xxxxxxxxx> wrote:
> On Fri, Feb 4, 2011 at 7:08 AM, Ivan Voras <ivoras@xxxxxxxxxxx> wrote:
>>                                 ->  BitmapAnd  (cost=1282.94..1282.94
>> rows=1430 width=0) (actual time=5.508..5.508 rows=0 loops=1)
>>                                       ->  Bitmap Index Scan on
>> news_index_layout_id_state  (cost=0.00..150.14 rows=2587 width=0) (actual
>> time=0.909..0.909 rows=3464 loops=1)
>>                                             Index Cond: ((layout_id = 8980)
>> AND (state = 2))
>>                                       ->  BitmapOr (cost=1132.20..1132.20
>> rows=20127 width=0) (actual time=4.136..4.136 rows=0 loops=1)
>>                                             ->  Bitmap Index Scan on
>> news_visible_from  (cost=0.00..1122.09 rows=19976 width=0) (actual
>> time=3.367..3.367 rows=19932 loops=1)
>>                                                   Index Cond: (visible_from
>> IS NULL)
>>                                             ->  Bitmap Index Scan on
>> news_visible_to  (cost=0.00..9.40 rows=151 width=0) (actual
>> time=0.766..0.766 rows=43 loops=1)
>>                                                   Index Cond: (1296806570 <=
>> visible_to)
>
> I think this part of the query is the problem.  Since the planner
> doesn't support cross-column statistics, it can't spot the correlation
> between these different search conditions, resulting in a badly broken
> selectivity estimate.
>
> Sometimes you can work around this by adding a single column, computed
> with a trigger, that contains enough information to test the whole
> WHERE-clause condition using a single indexable test against the
> column value.  Or sometimes you can get around it by partitioning the
> data into multiple tables, say with the visible_from IS NULL rows in a
> different table from the rest.

Why should you need cross column statistics for this case?  You should
be able to multiple selectivity from left to right as long as you are
doing equality comparisons, yes?

Right now the planner is treating
select * from foo where (a,b,c) between (1,1,1) and (9,9,9) the same
(using selectivity on a) as
select * from foo where (a,b,c) between (1,1,5) and (1,1,7)

but they are not the same. since in the second query terms a,b are
equal, shouldn't you able to multiply the selectivity through?

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance