2011/2/27 Віталій Тимчишин <tivv00@xxxxxxxxx>: > > > 27 лютого 2011 р. 19:59 Robert Haas <robertmhaas@xxxxxxxxx> написав: >> >> 2011/2/4 Віталій Тимчишин <tivv00@xxxxxxxxx>: >> > Hi, all. >> > All this optimizer vs hint thread reminded me about crazy idea that got >> > to >> > my head some time ago. >> > I currently has two problems with postgresql optimizer >> > 1) Dictionary tables. Very usual thing is something like "select * from >> > big_table where distionary_id = (select id from dictionary where >> > name=value)". This works awful if dictionary_id distribution is not >> > uniform. >> >> Does it work better if you write it as a join? >> >> SELECT b.* FROM big_table b, dictionary d WHERE b.dictionary_id = d.id >> AND d.name = 'value' >> >> I would like to see a concrete example of this not working well, >> because I've been writing queries like this (with MANY tables) for >> years and it's usually worked very well for me. >> > Here you are: > PostgreSQL 8.4.7 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real > (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit > create table a(dict int4, val int4); > create table b(dict int4, name text); > create index c on a(dict); > insert into b values (1, 'small'), (2, 'large'); > insert into a values (1,1); > insert into a select 2,generate_series(1,10000); > analyze a; > analyze b; > test=# explain analyze select * from a where dict=1; > QUERY PLAN > > ----------------------------------------------------------------------------------------------------- > Index Scan using c on a (cost=0.00..8.27 rows=1 width=8) (actual > time=0.014..0.016 rows=1 loops=1) > Index Cond: (dict = 1) > Total runtime: 0.041 ms > (3 rows) > test=# explain analyze select * from a where dict=2; > QUERY PLAN > > ----------------------------------------------------------------------------------------------------- > Seq Scan on a (cost=0.00..170.01 rows=10000 width=8) (actual > time=0.014..6.876 rows=10000 loops=1) > Filter: (dict = 2) > Total runtime: 13.419 ms > (3 rows) > test=# explain analyze select * from a,b where a.dict=b.dict and > b.name='small'; > QUERY PLAN > > ----------------------------------------------------------------------------------------------------------- > Hash Join (cost=1.04..233.55 rows=5000 width=18) (actual > time=0.047..13.159 rows=1 loops=1) > Hash Cond: (a.dict = b.dict) > -> Seq Scan on a (cost=0.00..145.01 rows=10001 width=8) (actual > time=0.009..6.633 rows=10001 loops=1) > -> Hash (cost=1.02..1.02 rows=1 width=10) (actual time=0.011..0.011 > rows=1 loops=1) > -> Seq Scan on b (cost=0.00..1.02 rows=1 width=10) (actual > time=0.006..0.008 rows=1 loops=1) > Filter: (name = 'small'::text) > Total runtime: 13.197 ms > (7 rows) > test=# explain analyze select * from a,b where a.dict=b.dict and > b.name='large'; > QUERY PLAN > > ----------------------------------------------------------------------------------------------------------- > Hash Join (cost=1.04..233.55 rows=5000 width=18) (actual > time=0.074..21.476 rows=10000 loops=1) > Hash Cond: (a.dict = b.dict) > -> Seq Scan on a (cost=0.00..145.01 rows=10001 width=8) (actual > time=0.012..7.085 rows=10001 loops=1) > -> Hash (cost=1.02..1.02 rows=1 width=10) (actual time=0.021..0.021 > rows=1 loops=1) > -> Seq Scan on b (cost=0.00..1.02 rows=1 width=10) (actual > time=0.015..0.016 rows=1 loops=1) > Filter: (name = 'large'::text) > Total runtime: 28.293 ms > (7 rows) > It simply don't know that small=1 and large=2, so it never uses nested loop > + iindex scan: > test=# set enable_hashjoin=false; > SET > test=# explain analyze select * from a,b where a.dict=b.dict and > b.name='small'; > QUERY PLAN > > ---------------------------------------------------------------------------------------------------------------- > Nested Loop (cost=0.00..253.28 rows=5000 width=18) (actual > time=0.041..0.047 rows=1 loops=1) > -> Seq Scan on b (cost=0.00..1.02 rows=1 width=10) (actual > time=0.010..0.012 rows=1 loops=1) > Filter: (name = 'small'::text) > -> Index Scan using c on a (cost=0.00..189.75 rows=5000 width=8) > (actual time=0.021..0.023 rows=1 loops=1) > Index Cond: (a.dict = b.dict) > Total runtime: 0.089 ms > (6 rows) Oh, I see. Interesting example. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance