Search Postgresql Archives

Re: Merge join vs merge semi join against primary key

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10 October 2015 at 08:52, Sean Rhea <sean.c.rhea@xxxxxxxxx> wrote:

1. When I join two tables with "WHERE id IN (...)" versus with an explicit
join, and the join column for the inner table is a primary key, I would expect
the same behavior in both cases, but the optimizer is choosing a merge join in
one case and a merge semi join in the other. There's at most one customer
with a given id. Why not do a semi join?


Unfortunately the 9.2 planner does not make any checks to verify that customers.id is unique to perform a semi join. There is a pending patch in the 9.6 cycle to add this optimisation. 
 
production=> select count(*) from customers;
 count
--------
 473733
(1 row)

...
 
   ->  Index Scan using customers_pkey on customers o  (cost=0.00..63836.61 rows=836 width=8) (actual time=159.840..201915.765 rows=7672 loops=1)
         Filter: (group_id = 45)
         Rows Removed by Filter: 212699113


Rows Removed by Filter: 212699113 seems to indicate that your 473733 row count for "customers" is incorrect. 

If you're doing lots of filtering on group_id, then perhaps you should think about adding an index on customers (group_id,id)


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux