Hi Scott, Thank you for reply. I am using Postgres 8.4.0 (btw - great release --very happy about it) and I got a different plan after following your advice: create index t_mtflags_bit on staging.tmp_t ((mt_flags&8)); analyze staging.tmp_t; explain analyze select count(*) from staging.tmp_t where mt_flags&8=0; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=89122.78..89122.79 rows=1 width=0) (actual time=2994.970..2994.971 rows=1 loops=1) -> Seq Scan on tmp_t (cost=0.00..83023.93 rows=2439541 width=0) (actual time=0.012..2161.886 rows=2439435 loops=1) Filter: ((mt_flags & 8) = 0) Total runtime: 2995.017 ms (4 rows) The seq scan is OK, since I don't expect Postgres to use index scan for such low-selective condition. It would be tough for me to support indexes for each bit flag value and their combinations. E.g. in the query below it is again 200x off on number of rows. explain analyze select count(*) from staging.tmp_t where mt_flags&134=0; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Aggregate (cost=83054.43..83054.44 rows=1 width=0) (actual time=2964.960..2964.960 rows=1 loops=1) -> Seq Scan on tmp_t (cost=0.00..83023.93 rows=12200 width=0) (actual time=0.014..2152.031 rows=2362257 loops=1) Filter: ((mt_flags & 134) = 0) Total runtime: 2965.009 ms (4 rows) I still wonder if it's something I could/should report as a bug? I've been struggling with this issue in 8.2, 8.3.x (now using 8.4.0). We can more or less work around this by disabling nestloop in our analytics queries but I have problems enforcing this in reporting applications. Thanks, -Slava Moudry. -----Original Message----- From: Scott Marlowe [mailto:scott.marlowe@xxxxxxxxx] Sent: Tuesday, August 18, 2009 12:09 AM To: Slava Moudry Cc: pgsql-performance@xxxxxxxxxxxxxx Subject: Re: number of rows estimation for bit-AND operation On Mon, Aug 17, 2009 at 2:07 PM, Slava Moudry<smoudry@xxxxxxxxx> wrote: > Hi, > > I am using int8 field to pack a number of error flags. This is very common > technique for large tables to pack multiple flags in one integer field. > > For most records - the mt_flags field is 0. Here is the statistics (taken > from pgAdmin Statistics tab for mt_flags column): > > Most common Values: {0,128,2,4,8) > > Most common Frequencies: {0.96797,0.023,0.0076,0.0005,0.00029) > > What I notice that when bit-AND function is used - Postgres significantly > underestimates the amount of rows: > > explain analyze select count(*) from mt__20090801 where mt_flags&8=0; > > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------- > > Aggregate (cost=83054.43..83054.44 rows=1 width=0) (actual > time=2883.154..2883.154 rows=1 loops=1) > > -> Seq Scan on mt__20090801 (cost=0.00..83023.93 rows=12200 width=0) > (actual time=0.008..2100.390 rows=2439435 loops=1) > > Filter: ((mt_flags & 8) = 0) > > Total runtime: 2883.191 ms > > (4 rows) > > This is not an issue for the particular query above, but I noticed that due > to that miscalculation in many cases Postgres chooses plan with Nested Loops > for other queries. I can fix it by setting enable_nest_loops to off, but > it's not something I should set for all queries. > > Is there any way to help Postgres make a better estimation for number of > rows returned by bit function? You can index on the function. For instance: create table t (mt_flags int); create index t_mtflags_bit on t ((mt_flags&8)); insert into t select case when random() > 0.95 then case when random() >0.5 then 8 else 12 end else 0 end from generate_series(1,10000); analyze t; explain select * from t where mt_flags&8=8; QUERY PLAN -------------------------------------------------------------------------- Index Scan using t_mtflags_bit on t (cost=0.00..52.17 rows=467 width=4) Index Cond: ((mt_flags & 8) = 8) (2 rows) Hope that helps a little. -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance