how to interpret/improve bad row estimates

Robert Treat <xzilla@xxxxxxxxxxxxxxxxxxxxx> · 23 Feb 2006 11:29:32 -0500

postgresql 8.1, I have two tables, bot hoth vacuumed and analyzed. on
msg307 I have altered the entityid and msgid columns statistics values
to 400. 

dev20001=# explain analyze  SELECT ewm.entity_id, m.agentname, m.filecreatedate AS versioninfo
   FROM msg307 m join entity_watch_map ewm on (ewm.entity_id = m.entityid AND ewm.msgid = m.msgid AND ewm.msg_type = 307);

                                                                  QUERY PLAN                                                                   
-----------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=6.62..5227.40 rows=1 width=36) (actual time=0.583..962.346 rows=75322 loops=1)
   ->  Bitmap Heap Scan on entity_watch_map ewm  (cost=6.62..730.47 rows=748 width=8) (actual time=0.552..7.017 rows=1264 loops=1)
         Recheck Cond: (msg_type = 307)
         ->  Bitmap Index Scan on ewm_msg_type  (cost=0.00..6.62 rows=748 width=0) (actual time=0.356..0.356 rows=1264 loops=1)
               Index Cond: (msg_type = 307)
   ->  Index Scan using msg307_entityid_msgid_idx on msg307 m  (cost=0.00..6.00 rows=1 width=40) (actual time=0.011..0.295 rows=60 loops=1264)
         Index Cond: (("outer".entity_id = m.entityid) AND ("outer".msgid = m.msgid))
 Total runtime: 1223.469 ms
(8 rows)

I guess that the planner can not tell there is no correlation between
the distinctness of those two columns, and so makes a really bad
estimate on the indexscan, and pushes that estimate up into the nested
loop? (luckily in this case doing an index scan is generally a good
idea, so it works out, but it wouldn't always be a good idea) 

some pg_statistics information for those two columns
entityid:
starelid    | 25580
staattnum   | 1
stanullfrac | 0
stawidth    | 4
stadistinct | 1266
stakind1    | 1
stakind2    | 2
stakind3    | 3
stakind4    | 0
staop1      | 96
staop2      | 97
staop3      | 97
staop4      | 0
stanumbers1 | {0.00222976,0.00222976,0.00153048,0.00137216,0.00137216}
stanumbers2 | 
stanumbers3 | {0.100312}
stanumbers4 | 

msgid:
starelid    | 25580
staattnum   | 2
stanullfrac | 0
stawidth    | 4
stadistinct | 1272
stakind1    | 1
stakind2    | 2
stakind3    | 3
stakind4    | 0
staop1      | 96
staop2      | 97
staop3      | 97
staop4      | 0
stanumbers1 | {0.00164923,0.00163604,0.00163604,0.00163604,0.00137216}
stanumbers2 | 
stanumbers3 | {-0.0660856}
stanumbers4 | 

is my interpretation of why i am seeing such bad estimates correct? I
don't really think it is, because looking at a similar scenario on a 7.3
machine:

------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=1531.39..5350.90 rows=1 width=48) (actual time=118.44..899.37 rows=58260 loops=1)
   Merge Cond: (("outer".entityid = "inner".entity_id) AND ("outer".msgid = "inner".msgid))
   ->  Index Scan using msg307_entityid_msgid_idx on msg307 m  (cost=0.00..3669.42 rows=58619 width=40) (actual time=0.31..390.01 rows=58619 loops=1)
   ->  Sort  (cost=1531.39..1533.16 rows=709 width=8) (actual time=118.09..157.45 rows=58218 loops=1)
         Sort Key: ewm.entity_id, ewm.msgid
         ->  Seq Scan on entity_watch_map ewm  (cost=0.00..1497.80 rows=709 width=8) (actual time=0.14..114.74 rows=1157 loops=1)
               Filter: (msg_type = 307)
 Total runtime: 951.23 msec
(8 rows)

It still has the bad estimate at the nested loop stage, but it does seem
to have a better understanding of the # of rows it will return in the
index scan on msg307. This leads me to wonder if there something I could
do to improve the estimates on the 8.1 machine? 

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL