Search Postgresql Archives

Re: Compound keys and foreign constraints

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/2/04 12:33 PM, "Alvaro Herrera" <alvherre@dcc.uchile.cl> wrote:

>>                ->  Index Scan using message_recipients_i_recipient on
>> message_recipients r  (cost=0.00..30569.25 rows=30622 width=21) (actual
>> time=83.146..31609.149 rows=312741 loops=1)
> 
> This estimate is off by an order of magnitude.  Maybe you want to
> increase the statistic target for this column ...

Please explain.  The only thing I can find about this is

"For scalar data types, attstattarget is both the target number of ³most
common values² to collect, and the target number of histogram bins to
create."

Why would this affect the row count estimations?

With 10 being the default, what would an appropriate value be?

I changed it to 100 for both message_key and reciepent (in
message_recipients table).  That used the same plan and made the discrepancy
worse 13000 estimated, 312000 actual:

>             ->  Index Scan using message_recipients_i_recipient on
> message_recipients r  (cost=0.00..13324.75 rows=13249 width=21) (actual
> time=4751.701..34373.808 rows=312741 loops=1)

And, it still only uses the compound index if I set random_page_cost to < 1.

Analyze is run regularly by auto-vacuum (multiple times a day - I need to
reduce this).

If I read this right, before changing the statistic target:


db=# select relname, relfilenode from pg_class where relname like
'message_recip%';

            relname             | relfilenode
--------------------------------+-------------
 message_recipients_i_recipient |       17234
 message_recipients_i_message   |       17235
 message_recipients             |       17162
(3 rows)


db=> select attname, attrelid, attstattarget from pg_attribute where
attname='recipient';
  attname  | attrelid | attstattarget
-----------+----------+---------------
 recipient |    17162 |            -1
 recipient |    17234 |             0


db=# select staattnum, stadistinct from pg_statistic where starelid=17162;
 staattnum | stadistinct
-----------+-------------
         1 | 1.11264e+06
         2 |        3598
         3 |           2
(3 rows)

db=# select attname, attstattarget, attnum from pg_attribute where
attrelid=17162;
    attname     | attstattarget | attnum
----------------+---------------+--------
 tableoid       |             0 |     -7
 cmax           |             0 |     -6
 xmax           |             0 |     -5
 cmin           |             0 |     -4
 xmin           |             0 |     -3
 oid            |             0 |     -2
 ctid           |             0 |     -1
 message_key    |            -1 |      1
 recipient      |            -1 |      2
 recipient_type |            -1 |      3


There are 20+ million (unique) message_key records, so stadistinct for
column 1 is off by a factor of 20.  There are just under 1.2 million unique
recipients (60 million total records), so column 2 (recipient) is off by a
factor of over 300.

In the messages table, message_key (which is unique) has statdistinct = -1


db=# alter table message_recipients alter column recipient set statistics
100;
db=# alter table message_recipients alter column message_key set statistics
100;
db=# analyze;

db=# select distinct relname, attname, attnum, attstattarget, stadistinct
from pg_class c, pg_attribute a, pg_statistic s where
c.relfilenode=a.attrelid and c.relfilenode=starelid and a.attnum=s.staattnum
and relname='message_recipients' order by attnum;

      relname       |    attname     | attnum | attstattarget | stadistinct
--------------------+----------------+--------+---------------+-------------
 message_recipients | message_key    |      1 |           100 | 2.19256e+06
 message_recipients | recipient      |      2 |           100 |        8672
 message_recipients | recipient_type |      3 |            -1 |           2

Stadistinct is about twice what it was before, but is still way off the
number of distinct values for that column.  Manually setting stadistinct
closer to the number of unique values makes the discrepancy still larger.

Wes


---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux