On Mon, Aug 17, 2009 at 2:07 PM, Slava Moudry<smoudry@xxxxxxxxx> wrote: > Hi, > > I am using int8 field to pack a number of error flags. This is very common > technique for large tables to pack multiple flags in one integer field. > > For most records – the mt_flags field is 0. Here is the statistics (taken > from pgAdmin Statistics tab for mt_flags column): > > Most common Values: {0,128,2,4,8) > > Most common Frequencies: {0.96797,0.023,0.0076,0.0005,0.00029) > > What I notice that when bit-AND function is used – Postgres significantly > underestimates the amount of rows: > > explain analyze select count(*) from mt__20090801 where mt_flags&8=0; > > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------- > > Aggregate (cost=83054.43..83054.44 rows=1 width=0) (actual > time=2883.154..2883.154 rows=1 loops=1) > > -> Seq Scan on mt__20090801 (cost=0.00..83023.93 rows=12200 width=0) > (actual time=0.008..2100.390 rows=2439435 loops=1) > > Filter: ((mt_flags & 8) = 0) > > Total runtime: 2883.191 ms > > (4 rows) > > This is not an issue for the particular query above, but I noticed that due > to that miscalculation in many cases Postgres chooses plan with Nested Loops > for other queries. I can fix it by setting enable_nest_loops to off, but > it's not something I should set for all queries. > > Is there any way to help Postgres make a better estimation for number of > rows returned by bit function? You can index on the function. For instance: create table t (mt_flags int); create index t_mtflags_bit on t ((mt_flags&8)); insert into t select case when random() > 0.95 then case when random() >0.5 then 8 else 12 end else 0 end from generate_series(1,10000); analyze t; explain select * from t where mt_flags&8=8; QUERY PLAN -------------------------------------------------------------------------- Index Scan using t_mtflags_bit on t (cost=0.00..52.17 rows=467 width=4) Index Cond: ((mt_flags & 8) = 8) (2 rows) Hope that helps a little. -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance