2009/7/20 Віталій Тимчишин <tivv00@xxxxxxxxx>: > 20 липня 2009 р. 11:02 Chris <dmagick@xxxxxxxxx> написав: >> >> Віталій Тимчишин wrote: >>> >>> >>> 2009/7/20 Robert James <srobertjames@xxxxxxxxx >>> <mailto:srobertjames@xxxxxxxxx>> >>> >>> >>> Hi. I notice that when I do a WHERE x, Postgres uses an index, and >>> when I do WHERE y, it does so as well, but when I do WHERE x OR y, >>> it doesn't. Why is this so? >>> >>> It's not clever enough. >> >> Of course it is. > > For simple cases > >> >> >> I'm running 8.3.7. >> >> create table t1(id int primary key); >> insert into t1(id) select a from generate_series(1, 500000) as s(a); >> analyze t1; > > explain analyze select * from t1 where > id < 10000 > > "Index Scan using t1_pkey on t1 (cost=0.00..322.51 rows=9612 width=4) > (actual time=0.030..3.700 rows=9999 loops=1)" > " Index Cond: (id < 10000)" > "Total runtime: 4.835 ms" > > explain analyze select * from t1 where > id in (select (random() * 500000)::int4 from generate_series(0,10)) > > "Nested Loop (cost=32.50..1341.49 rows=200 width=4) (actual > time=15.353..67.014 rows=11 loops=1)" > " -> HashAggregate (cost=32.50..34.50 rows=200 width=4) (actual > time=0.028..0.043 rows=11 loops=1)" > " -> Function Scan on generate_series (cost=0.00..20.00 rows=1000 > width=0) (actual time=0.014..0.020 rows=11 loops=1)" > " -> Index Scan using t1_pkey on t1 (cost=0.00..6.52 rows=1 width=4) > (actual time=6.083..6.084 rows=1 loops=11)" > " Index Cond: (t1.id = (((random() * 500000::double > precision))::integer))" > "Total runtime: 67.070 ms" > > explain analyze select * from t1 where > id in (select (random() * 500000)::int4 from generate_series(0,10)) > or > id < 10000 > > "Seq Scan on t1 (cost=22.50..9735.50 rows=254806 width=4) (actual > time=0.049..148.947 rows=10010 loops=1)" > " Filter: ((hashed subplan) OR (id < 10000))" > " SubPlan" > " -> Function Scan on generate_series (cost=0.00..20.00 rows=1000 > width=0) (actual time=0.014..0.019 rows=11 loops=1)" > "Total runtime: 150.123 ms" > > explain analyze > select * from t1 where > id in (select (random() * 500000)::int4 from generate_series(0,10)) > union > select * from t1 where > id < 10000 > > "Unique (cost=2412.68..2461.74 rows=9812 width=4) (actual > time=89.190..95.014 rows=10010 loops=1)" > " -> Sort (cost=2412.68..2437.21 rows=9812 width=4) (actual > time=89.189..91.167 rows=10010 loops=1)" > " Sort Key: public.t1.id" > " Sort Method: quicksort Memory: 854kB" > " -> Append (cost=32.50..1762.13 rows=9812 width=4) (actual > time=16.641..76.338 rows=10010 loops=1)" > " -> Nested Loop (cost=32.50..1341.49 rows=200 width=4) > (actual time=16.641..70.051 rows=11 loops=1)" > " -> HashAggregate (cost=32.50..34.50 rows=200 width=4) > (actual time=0.033..0.049 rows=11 loops=1)" > " -> Function Scan on generate_series > (cost=0.00..20.00 rows=1000 width=0) (actual time=0.020..0.026 rows=11 > loops=1)" > " -> Index Scan using t1_pkey on t1 (cost=0.00..6.52 > rows=1 width=4) (actual time=6.359..6.361 rows=1 loops=11)" > " Index Cond: (public.t1.id = (((random() * > 500000::double precision))::integer))" > " -> Index Scan using t1_pkey on t1 (cost=0.00..322.51 > rows=9612 width=4) (actual time=0.023..4.075 rows=9999 loops=1)" > " Index Cond: (id < 10000)" > "Total runtime: 112.694 ms" Hmm. What you're suggesting here is that we could consider implementing OR conditions by rescanning the inner side for each index qual and then unique-ifying the results on the index column. That's probably possible, but it doesn't sound easy, especially since our selectivity-estimation code for OR conditions is not very good, so we might choose to do it this way when that's not actually the best plan. ...Robert -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance