Rowcount estimation changes based on from clause order

Ants Aasma <ants.aasma@xxxxxxxx> · Wed, 11 Oct 2017 12:33:31 +0300

I stumbled upon a severe row count underestimation that confusingly
went away when two inner joins in the from clause were reordered. I
whittled it down to a reproducible test case.

Schema:

CREATE TABLE small (id serial primary key, ref_id int not null, subset
int not null);
CREATE TABLE big (id serial primary key, small_id int not null);

INSERT INTO small (ref_id, subset) SELECT i/2+1, i/2+1 FROM
generate_series(1,1000) i;
INSERT INTO big (small_id) SELECT (i % 1000) + 1 FROM
generate_series(1,1000000) i;

CREATE INDEX ON small (ref_id);
CREATE INDEX ON big (small_id);

ANALYZE;

And the queries, differing in only the order of joins:

SELECT * FROM
      small
      INNER JOIN big ON small.id = big.small_id
      INNER JOIN (SELECT 1 UNION ALL SELECT 2) lookup(ref) ON
lookup.ref = small.ref_id
WHERE small.subset = 42;

SELECT * FROM
      small
      INNER JOIN (SELECT 1 UNION ALL SELECT 2) lookup(ref) ON
lookup.ref = small.ref_id
      INNER JOIN big ON small.id = big.small_id
WHERE small.subset = 42;

Resulting plan for the first case:
 Nested Loop  (cost=20.45..2272.13 rows=8 width=24)
   ->  Nested Loop  (cost=0.28..16.69 rows=1 width=16)
         ->  Append  (cost=0.00..0.04 rows=2 width=4)
               ->  Result  (cost=0.00..0.01 rows=1 width=4)
               ->  Result  (cost=0.00..0.01 rows=1 width=4)
         ->  Index Scan using small_ref_id_idx on small
(cost=0.28..8.32 rows=1 width=12)
               Index Cond: (ref_id = (1))
               Filter: (subset = 42)
   ->  Bitmap Heap Scan on big  (cost=20.18..2245.44 rows=1000 width=8)
         Recheck Cond: (small_id = small.id)
         ->  Bitmap Index Scan on big_small_id_idx  (cost=0.00..19.93
rows=1000 width=0)
               Index Cond: (small_id = small.id)

Second case plan is identical except row count of the topmost nest loop:
 Nested Loop  (cost=20.45..2272.13 rows=1000 width=24)

The union subselect was in reality somewhat more complicated, but for
the row count issue the simplification does not seem to matter. The
behavior is seen both on 9.4 and on master.

Does anybody have any idea what is going on here? In the real world
case this is based on the estimation was 5 rows instead of 200k, which
resulted in quite bad plan choices downstream.

Regards,
Ants Aasma

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance