2011/2/4 <tivv00@xxxxxxxxx>:
> Hi, all.
> All this optimizer vs. hints thread reminded me of a crazy idea that got
> into my head some time ago.
> I currently have two problems with the PostgreSQL optimizer:
> 1) Dictionary tables. A very common pattern is something like "select *
> from big_table where dictionary_id = (select id from dictionary where
> name = value)". This performs awfully if the dictionary_id distribution
> is not uniform.

Does it work better if you write it as a join?

SELECT b.*
FROM big_table b, dictionary d
WHERE b.dictionary_id = d.id
AND d.name = 'value'

I would like to see a concrete example of this not working well, because
I've been writing queries like this (with MANY tables) for years and it's
usually worked very well for me.

> The thing that helps is to retrieve the subselect value first and then
> simply do "select * from big_table where dictionary_id = id_value".
> 2) Complex queries. If there are more than about three levels of
> subselects, the optimizer's row-count estimates often become less and
> less accurate as we go up the levels. At around the third level this
> often leads to wrong plan choices. The thing that helps is to create
> temporary tables from the subselects, analyze them, and then run the
> main select against those temporary tables.
> While the first problem could be fixed by introducing some correlation
> statistics, I don't think there is any simple way to fix the second one.
> But what if the optimizer could, in some cases, say "fetch this and
> this, and then I'll plan the rest of the query based on statistics from
> what you've fetched"?

I've had that thought, too. It's pretty hard to see how to make it work,
but I think there are cases where it could be beneficial.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
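
[Editor's note: the temporary-table workaround described in point 2 can be
sketched in SQL roughly as follows. The table and column names here are
hypothetical, invented for illustration; the pattern is simply "materialize
the subselect, ANALYZE it, then run the outer query against it".]

```sql
-- Stand-in for a deeply nested subselect: materialize it first so the
-- planner gets concrete statistics instead of multi-level estimates.
CREATE TEMPORARY TABLE tmp_sub AS
SELECT order_id, sum(amount) AS total
FROM order_lines
GROUP BY order_id;

-- Gather fresh row-count and distribution statistics for the planner.
ANALYZE tmp_sub;

-- The main query now joins against a table with known statistics,
-- so join order and method can be chosen from real numbers.
SELECT o.*, t.total
FROM orders o
JOIN tmp_sub t ON t.order_id = o.order_id
WHERE t.total > 1000;
```
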