Re: number of rows estimation for bit-AND operation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 17, 2009 at 2:07 PM, Slava Moudry<smoudry@xxxxxxxxx> wrote:
> Hi,
>
> I am using int8 field to pack a number of error flags. This is very common
> technique for large tables to pack multiple flags in one integer field.
>
> For most records – the mt_flags field is 0. Here is the statistics (taken
> from pgAdmin Statistics tab for mt_flags column):
>
> Most common Values: {0,128,2,4,8)
>
> Most common Frequencies: {0.96797,0.023,0.0076,0.0005,0.00029)
>
> What I notice that when bit-AND function is used – Postgres significantly
> underestimates the amount of rows:
>
> explain analyze select count(*) from mt__20090801 where  mt_flags&8=0;
>
>                               QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------
>
>  Aggregate  (cost=83054.43..83054.44 rows=1 width=0) (actual
> time=2883.154..2883.154 rows=1 loops=1)
>
>    ->  Seq Scan on mt__20090801  (cost=0.00..83023.93 rows=12200 width=0)
> (actual time=0.008..2100.390 rows=2439435 loops=1)
>
>          Filter: ((mt_flags & 8) = 0)
>
>  Total runtime: 2883.191 ms
>
> (4 rows)
>
> This is not an issue for the particular query above, but I noticed that due
> to that miscalculation in many cases Postgres chooses plan with Nested Loops
> for other queries. I can fix it by setting enable_nest_loops to off, but
> it's not something I should set for all queries.
>
> Is there any way to help Postgres make a better estimation for number of
> rows returned by bit function?

You can index on the function.  For instance:

create table t (mt_flags int);
create index t_mtflags_bit on t ((mt_flags&8));
insert into t select case when random() > 0.95 then case when random()
>0.5 then 8 else 12 end else 0 end from generate_series(1,10000);
analyze t;
explain select * from t where mt_flags&8=8;
                                QUERY PLAN
--------------------------------------------------------------------------
 Index Scan using t_mtflags_bit on t  (cost=0.00..52.17 rows=467 width=4)
   Index Cond: ((mt_flags & 8) = 8)
(2 rows)

Hope that helps a little.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux