On 13-04-12 09:20 PM, Jeff Janes wrote:
> On Thursday, April 11, 2013, Steve Singer wrote:
>> I think the reason why it is picking the hash join based plans is
>> because of
>>
>> Index Scan using table_b_1_ptid_orgid_ym_unq on table_b_1 b
>> (cost=0.00..503.86 rows=1 width=10) (actual time=0.016..0.017 rows=1 loops=414249)
>>   Index Cond: ((a.id = a_id) AND (organization_id = 2) AND (year = 2013) AND (month = 3))
>>   Filter: (product_id = 1)
> Trying to reason about how the planner estimates costs for the inner
> side of nested loops makes my head hurt.
>
> So before doing that, could you run explain (analyze,buffers) on both of
> these much simpler (but hopefully morally equivalent to this planner
> node) sql:
>
> select * from table_b_1 where a_id = <some plausible value> and
> organization_id=2 and year=2013 and month=3
>
> select * from table_b_1 where a_id = <some plausible value> and
> organization_id=2 and year=2013 and month=3 and product_id=1
table_b_1 is a partition of table_b on product_id, so when querying
table_b_1 directly the two queries are equivalent.
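For context, this is the usual 9.2-era inheritance-based partitioning; a rough sketch of the assumed layout (the actual constraint and column definitions may differ):

```
-- Hypothetical sketch: table_b_1 holds only product_id = 1 rows, and the
-- CHECK constraint lets constraint_exclusion skip the other children when
-- the where clause pins down product_id.
CREATE TABLE table_b_1 (
    CHECK (product_id = 1)
) INHERITS (table_b);
```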
explain (analyze,buffers) select * FROM table_b_1 where a_id=1128944 and
organization_id=2 and year=2013 and month=3;

                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Index Scan using table_b_1_ptid_orgid_ym_unq on table_b_1  (cost=0.00..50.73 rows=1 width=56) (actual time=60.328..60.330 rows=1 loops=1)
   Index Cond: ((a_id = 1128944) AND (organization_id = 2) AND (year = 2013) AND (month = 3))
   Buffers: shared hit=1 read=5
 Total runtime: 60.378 ms
(4 rows)
The plans are the same whether or not I specify product_id in the where
clause (if I query the parent table and omit product_id from the where
clause, it of course scans all the other partitions as well).
> Of particular interest here is whether the estimate of 1 row is due to
> the specificity of the filter, or if the index clauses alone are
> specific enough to drive that estimate. (If you get many rows without
> the product_id filter, that would explain the high estimate.)
It's the index clauses alone; we normally expect 1 row back for a query
like that.
> Please run with the default cost parameters, or if you can't get the
> right plan with the defaults, specify what the used parameters were.
The line

indexTotalCost += index->pages * spc_random_page_cost / 100000.0;

is what is driving my high costs on the inner loop. The index has 2-5
million pages depending on the partition. If I run this against 9.2.2,
which divides by 10000.0 instead, the estimate is even higher.
If I instead try the 9.3 version of that line,

indexTotalCost += log(1.0 + index->pages / 10000.0) * spc_random_page_cost;

and play with the cost parameters, I can make it pick the nested-loop
plan on some partitions with product_id=2 but not product_id=1. If I
remove the fudge-factor cost adjustment line entirely, I always get the
nested-loop plan.
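To get a feel for the magnitudes: for a 2-million-page index with the default random_page_cost of 4.0 (both values assumed here purely for illustration), the three variants of the fudge factor work out roughly as follows:

```
-- Assumed: 2M index pages, spc_random_page_cost = 4.0 (the default)
SELECT 2000000 * 4.0 / 10000.0           AS fudge_9_2_linear,  -- 800
       2000000 * 4.0 / 100000.0          AS fudge_div_100000,  -- 80
       ln(1.0 + 2000000 / 10000.0) * 4.0 AS fudge_9_3_log;     -- ~21.2
```

So even the 9.3 log form still adds about 21 cost units to every inner-side index scan on an index this size, while the 9.2 linear form adds 800.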
Breaking the index into smaller partial indexes for each year seems to
be giving me the plans I want with random_page_cost=2 (I might also try
partial indexes on the month).
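The per-year partial indexes I'm experimenting with look roughly like this (the index name is hypothetical and the column order is assumed from table_b_1_ptid_orgid_ym_unq; the real definition may differ):

```
-- One smaller partial index per year, so index->pages, and therefore the
-- pages-based fudge factor, shrinks for each index the planner considers.
CREATE INDEX table_b_1_2013_partial
    ON table_b_1 (a_id, organization_id, month)
    WHERE year = 2013;
```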
Even with the 9.3 log-based fudge factor, we are seeing the fudge factor
be big enough that the planner picks a table scan over the index. Many
loop iterations can be satisfied by cached pages of the index, and the
fudge factor doesn't really account for this.
> Cheers,
> Jeff
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance