No, I don't want to use LEFT JOINS. I want to use regular joins. But (as mentioned in my other follow-up), all the fields are not null (was not in the original email, sorry), and are foreign keys, so it is guaranteed to always match. The key part (in my mind) is that I am not filtering on any of the useless tables, and I am not returning any columns from those tables either. Both is known at planning time. Or is my logic still broken? Thanks, Igor -----Original Message----- From: Kevin Grittner [mailto:kgrittn@xxxxxxxxx] Sent: Friday, July 01, 2016 1:29 PM To: Sfiligoi, Igor <Igor.Sfiligoi@xxxxxx> Cc: pgsql-general@xxxxxxxxxxxxxx Subject: --EXTERNAL--Re: PSQL does not remove obvious useless joins On Fri, Jul 1, 2016 at 12:17 PM, Sfiligoi, Igor <Igor.Sfiligoi@xxxxxx> wrote: > We have a view that is very generic, and we noticed that PostgreSQL is > not very good at removing useless joins, which makes our queries very slow. > > We could change our code to avoid the view and write ad-hoc queries to > the underlying tables, but would prefer not to, if there is a way around it. If it did not do the joins it could not provide the information you are actually asking to see. Of course, there is a very good chance that what you are asking to see is not what you *want* to see. test=# create table a (id int primary key, name varchar(128)); CREATE TABLE test=# create table b (id int primary key, name varchar(128)); CREATE TABLE test=# create table c (id int primary key, test(# a_id int references a(id), test(# b1_id int references b(id), test(# b2_id int references b(id), test(# b3_id int references b(id)); CREATE TABLE test=# test=# create view v_broken as test-# select test-# c.id, c.a_id, c.b1_id, c.b2_id, c.b3_id, a.name a_name, test-# b1.name b1_name, b2.name b2_name, b3.name b3_name test-# from c, a, b b1, b b2, b b3 test-# where c.a_id=a.id and c.b1_id=b1.id and c.b2_id=b2.id and c.b3_id=b3.id; CREATE VIEW test=# test=# create view v as test-# select test-# c.id, c.a_id, c.b1_id, c.b2_id, c.b3_id, a.name a_name, test-# b1.name b1_name, b2.name b2_name, b3.name b3_name test-# from c test-# left join a on a.id = c.a_id test-# left join b b1 on b1.id = c.b1_id test-# left join b b2 on b2.id = c.b2_id test-# left join b b3 on b3.id = c.b3_id; CREATE VIEW test=# test=# insert into a values (1, 'a1'); INSERT 0 1 test=# insert into b values (1, 'b1'), (2, 'b2'), (3, 'b3'); INSERT 0 3 test=# insert into c values (1, 1, 1, 2, 3), (2, 1, 1, 2, null); INSERT 0 2 test=# test=# select id, b1_name from v_broken; id | b1_name ----+--------- 1 | b1 (1 row) test=# explain analyze select id, b1_name from v_broken; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=63.40..183.90 rows=1700 width=278) (actual time=0.049..0.052 rows=1 loops=1) Hash Cond: (c.b3_id = b3.id) -> Hash Join (cost=47.55..144.68 rows=1700 width=282) (actual time=0.030..0.033 rows=2 loops=1) Hash Cond: (c.b2_id = b2.id) -> Hash Join (cost=31.70..105.45 rows=1700 width=286) (actual time=0.018..0.020 rows=2 loops=1) Hash Cond: (c.b1_id = b1.id) -> Hash Join (cost=15.85..66.22 rows=1700 width=16) (actual time=0.010..0.012 rows=2 loops=1) Hash Cond: (c.a_id = a.id) -> Seq Scan on c (cost=0.00..27.00 rows=1700 width=20) (actual time=0.001..0.001 rows=2 loops=1) -> Hash (cost=12.60..12.60 rows=260 width=4) (actual time=0.003..0.003 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on a (cost=0.00..12.60 rows=260 width=4) (actual time=0.002..0.003 rows=1 loops=1) -> Hash (cost=12.60..12.60 rows=260 width=278) (actual time=0.005..0.005 rows=3 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on b b1 (cost=0.00..12.60 rows=260 width=278) (actual time=0.002..0.003 rows=3 loops=1) -> Hash (cost=12.60..12.60 rows=260 width=4) (actual time=0.006..0.006 rows=3 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on b b2 (cost=0.00..12.60 rows=260 width=4) (actual time=0.004..0.004 rows=3 loops=1) -> Hash (cost=12.60..12.60 rows=260 width=4) (actual time=0.013..0.013 rows=3 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on b b3 (cost=0.00..12.60 rows=260 width=4) (actual time=0.008..0.009 rows=3 loops=1) Planning time: 0.729 ms Execution time: 0.153 ms (23 rows) test=# select id, b1_name from v; id | b1_name ----+--------- 1 | b1 2 | b1 (2 rows) test=# explain analyze select id, b1_name from v; QUERY PLAN --------------------------------------------------------------------------------------------------------------- Hash Left Join (cost=15.85..66.22 rows=1700 width=278) (actual time=0.017..0.018 rows=2 loops=1) Hash Cond: (c.b1_id = b1.id) -> Seq Scan on c (cost=0.00..27.00 rows=1700 width=20) (actual time=0.005..0.005 rows=2 loops=1) -> Hash (cost=12.60..12.60 rows=260 width=278) (actual time=0.006..0.006 rows=3 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on b b1 (cost=0.00..12.60 rows=260 width=278) (actual time=0.002..0.003 rows=3 loops=1) Planning time: 0.177 ms Execution time: 0.044 ms (8 rows) Note the difference in results using inner joins versus left outer joins. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general